RE: [PATCH] tpm: do handle area size validation only when TPM space used

2017-03-28 Thread Alexander.Steffen
Yes, this fixes the issue for me. Thanks.

Alexander

> -Original Message-
> From: Jarkko Sakkinen [mailto:jarkko.sakki...@linux.intel.com]
> Sent: Tuesday, March 28, 2017 12:25 PM
> To: Steffen Alexander (IFAG CCS ESS D SW A)
> Cc: tpmdd-de...@lists.sourceforge.net; linux-security-
> mod...@vger.kernel.org; Peter Huewe; Marcel Selhorst; Jason Gunthorpe;
> open list
> Subject: Re: [PATCH] tpm: do handle area size validation only when TPM
> space used
> 
> So do you need this or not?
> 
> /Jarkko
> 
> On Mon, Mar 27, 2017 at 12:08:15AM +0300, Jarkko Sakkinen wrote:
> > In order to not cause backwards compatibility issues with
> > /dev/tpm0 disable handle area size validation if tpm_transmit is not
> > called with a TPM space.
> >
> > Signed-off-by: Jarkko Sakkinen 
> > ---
> >  drivers/char/tpm/tpm-interface.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/char/tpm/tpm-interface.c
> > b/drivers/char/tpm/tpm-interface.c
> > index bf0c3fa..158c1db 100644
> > --- a/drivers/char/tpm/tpm-interface.c
> > +++ b/drivers/char/tpm/tpm-interface.c
> > @@ -328,7 +328,9 @@ unsigned long tpm_calc_ordinal_duration(struct
> > tpm_chip *chip,  }  EXPORT_SYMBOL_GPL(tpm_calc_ordinal_duration);
> >
> > -static bool tpm_validate_command(struct tpm_chip *chip, const u8
> > *cmd,
> > +static bool tpm_validate_command(struct tpm_chip *chip,
> > +struct tpm_space *space,
> > +const u8 *cmd,
> >  size_t len)
> >  {
> > const struct tpm_input_header *header = (const void *)cmd; @@ -
> 340,6
> > +342,9 @@ static bool tpm_validate_command(struct tpm_chip *chip,
> const u8 *cmd,
> > if (len < TPM_HEADER_SIZE)
> > return false;
> >
> > +   if (!space)
> > +   return true;
> > +
> > if (chip->flags & TPM_CHIP_FLAG_TPM2 && chip->nr_commands) {
> > cc = be32_to_cpu(header->ordinal);
> >
> > @@ -386,7 +391,7 @@ ssize_t tpm_transmit(struct tpm_chip *chip, struct
> tpm_space *space,
> > unsigned long stop;
> > bool need_locality;
> >
> > -   if (!tpm_validate_command(chip, buf, bufsiz))
> > +   if (!tpm_validate_command(chip, space, buf, bufsiz))
> > return -EINVAL;
> >
> > if (bufsiz > TPM_BUFSIZE)
> > --
> > 2.9.3
> >



RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-17 Thread Alexander.Steffen
> Check for every TPM 2.0 command that the command code is supported and
> the command buffer has at least the length that can contain the header
> and the handle area.

This breaks several use cases for me:

1. I've got a TPM that implements vendor-specific command codes. Those cannot 
be send to the TPM anymore, but are rejected with EINVAL.

2. When upgrading the firmware on my TPM, it switches to a non-standard 
communication mode for the upgrade process and does not communicate using 
TPM2.0 commands during this time. Rejecting non-TPM2.0 commands means upgrading 
won't be possible anymore.

3. I'd like to use the kernel driver to test my TPM implementation. So for 
example, I send an invalid command code to the TPM and expect 
TPM_RC_COMMAND_CODE in response, but now I get EINVAL instead and the TPM never 
sees the command.

From my point of view, the kernel driver should provide a transparent 
communication channel to the TPM. Whatever I write to /dev/tpm should arrive 
at the TPM device, so that the TPM can handle it and return the appropriate 
response. Otherwise, you'll end up reimplementing all the command handling 
logic, that is already part of the TPM's job, and as soon as you miss one case 
and behave differently than the TPM, something relying on this behavior will 
break.

I see two possible solutions:

1. When the driver does not know a command code, it passes through the command 
unmodified. This bears the risk of unknown side effects though, so TPM spaces 
might not be as independent as they should be.

2. Since the command code lookup is only really necessary for TPM spaces, it 
only gets activated when space != NULL. So the change will not affect 
/dev/tpm, but only the new /dev/tpmrm. As /dev/tpmrm is not meant to 
be a transparent interface anyway, rejecting unknown commands is acceptable.

Alexander


RE: [tpmdd-devel] [PATCH v3 4/7] tpm: infrastructure for TPM spaces

2017-03-17 Thread Alexander.Steffen
> diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-
> interface.c
> index 20b1fe3..db5ffe9 100644
> --- a/drivers/char/tpm/tpm-interface.c
> +++ b/drivers/char/tpm/tpm-interface.c
> @@ -376,11 +376,12 @@ static bool tpm_validate_command(struct
> tpm_chip *chip, const u8 *cmd,
>   * 0 when the operation is successful.
>   * A negative number for system errors (errno).
>   */
> -ssize_t tpm_transmit(struct tpm_chip *chip, const u8 *buf, size_t bufsiz,
> -  unsigned int flags)
> +ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
> +  u8 *buf, size_t bufsiz, unsigned int flags)

When adding parameters, please also update the parameter documentation at the 
top of the function. It is missing for the new parameter "space" here.

Alexander


RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-20 Thread Alexander.Steffen
> I think the most straight-forward way to sort this out would be to limit
> validation to the resource manager.

Sounds good to me.

> If I send a fix, would you care to test it?

Sure, will do.

Alexander


RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-20 Thread Alexander.Steffen
>>> 2. When upgrading the firmware on my TPM, it switches to a
>>> non-standard communication mode for the upgrade process and does not
>>> communicate using TPM2.0 commands during this time. Rejecting
>>> non-TPM2.0 commands means upgrading won't be possible anymore.
>>
>> How non standard? Is the basic header even there? Are the lengths and
>> status code right?
>>
>> This might be an argument to add a 'raw' ioctl or something specifically
>> for this special case.
>
> It follows the regular TPM command syntax and looks something like 1.2
> commands.

Yep, so most of it already works with the current implementation.

There are a few special cases that need some thought though. For example, it is 
possible to use an upgrade to switch the TPM family from 1.2 to 2.0 (or vice 
versa). In this case it seems useful to let the kernel reinitialize the TPM 
driver, so it uses the correct timeouts for communication, activates the 
correct features (resource manager or not?), etc., without needing to reboot 
the system.

Another problem arises when the upgrade process is interrupted, e.g. because 
power is lost. Then the TPM is stuck in its non-standard upgrade mode, so the 
kernel does not recognize it as a valid TPM device and does not export 
/dev/tpm. But without the device node the upgrader is unable to restart the 
upgrade process, leaving the TPM forever inaccessible.

I'll try to work on those problems in the coming weeks and provide the fixes. 
Any input is appreciated.

Alexander


RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-21 Thread Alexander.Steffen
> > There are a few special cases that need some thought though. For
> > example, it is possible to use an upgrade to switch the TPM family
> > from 1.2 to 2.0 (or vice versa). In this case it seems useful to let
> > the kernel reinitialize the TPM driver, so it uses the correct
> > timeouts for communication, activates the correct features (resource
> > manager or not?), etc., without needing to reboot the system.
> 
> In practice, would a TPM upgrade from TPM 1.2 to TPM 2.0 even occur
> without a reboot?  Is it an important use case?
> 
> 1 - It would leave the SHA-256 PCRs in the reset state.
> 
> 2 - It's possible that this upgrade would also require a BIOS upgrade.

For a traditional PC and when your goal is platform integrity, a reboot is 
probably the way to go. But in an embedded environment where there is no BIOS 
or if you use the TPM more like a smartcard just to store some keys (or 
generate random numbers), a reboot is unnecessary and it is more comfortable to 
avoid it.

We probably should inform the kernel before the upgrade anyway, so that it can 
shut down the TPM gracefully (and maybe switch to the upgrade mode, as Jason 
suggested). With that infrastructure in place, it does not seem like a lot of 
effort to also let it switch the TPM back to normal operation mode once the 
upgrade is complete.

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-17 Thread Alexander.Steffen
> > Replace the specification of data structures by pointer dereferences
> > as the parameter for the operator "sizeof" to make the corresponding
> > size
> > determination a bit safer according to the Linux coding style
> > convention.
> 
> 
> This patch does one style in favor of the other.

I actually prefer that style, so I'd welcome this change :)

> At the end it's Jarkko's call, though I would NAK this as I think some
> one already told this to you for some other similar patch(es).
> 
> 
> I even would suggest to stop doing this noisy stuff, which keeps people
> busy for nothing.

Cleaning up old code is also worth something, even if does not change one bit 
in the assembly output in the end...

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-17 Thread Alexander.Steffen
> On Tue, 2017-10-17 at 11:50 +, alexander.stef...@infineon.com
> wrote:
> > > > Replace the specification of data structures by pointer dereferences
> > > > as the parameter for the operator "sizeof" to make the corresponding
> > > > size
> > > > determination a bit safer according to the Linux coding style
> > > > convention.
> > >
> > >
> > > This patch does one style in favor of the other.
> >
> > I actually prefer that style, so I'd welcome this change :)
> 
> Style changes should be reviewed and documented, like any other code
> change, and added to Documentation/process/coding-style.rst or an
> equivalent file.
> 
> > > At the end it's Jarkko's call, though I would NAK this as I think some
> > > one already told this to you for some other similar patch(es).
> > >
> > >
> > > I even would suggest to stop doing this noisy stuff, which keeps people
> > > busy for nothing.
> >
> > Cleaning up old code is also worth something, even if does not
> > change one bit in the assembly output in the end...
> 
> Wow, you're opening the door really wide for all sorts of trivial
> changes!  Hope you have the time and inclination to review and comment
> on all of them.  I certainly don't.

Well, isn't the point of trivial changes that they are trivial to review? :) 
For things like that there is probably not even a need to run a test, though 
with sufficient automation that should not be a problem either.

> There is a major difference between adding these sorts of checks to
> the tools in the scripts directory or even to the zero day bots that
> catch different sorts of errors, BEFORE code is upstreamed, and
> patches like these, after the fact.

Catching those things early in the process is certainly preferable. But at some 
point you need to fix the existing code, or you'll end up with a mashup of 
different styles, just because you did not want to touch old code.

> After the code has been upstreamed, it is a lot more difficult to
> justify changes like this.  It impacts both code that is being
> developed AND backporting bug fixes.

Backporting could be an argument, but even that should not be allowed to block 
improvements indefinitely. I'd prefer a world in which the current code is nice 
and clean and easy to maintain, to a world where we never touch old code unless 
it is proven to be wrong.

But looking at the code in question, I cannot see how this should ever be a 
serious problem. Even when backporting a change takes now ten minutes instead 
of five, which means it is twice as hard, it is still not difficult.

Alexander


RE: [PATCH v2] tpm-dev-common: Reject too short writes

2017-09-08 Thread Alexander.Steffen
> On Wed, Sep 06, 2017 at 02:19:28PM +,
> alexander.stef...@infineon.com wrote:
> > > On Wed, Sep 06, 2017 at 03:42:33PM +0300, Jarkko Sakkinen wrote:
> > > > On Mon, Sep 04, 2017 at 07:36:42PM +0200, Alexander Steffen wrote:
> > > > > tpm_transmit() does not offer an explicit interface to indicate the
> > > > > number of valid bytes in the communication buffer. Instead, it
> > > > > relies on the commandSize field in the TPM header that is encoded
> within
> > > the buffer.
> > > > > Therefore, ensure that a) enough data has been written to the
> > > > > buffer, so that the commandSize field is present and b) the
> > > > > commandSize field does not announce more data than has been
> written
> > > to the buffer.
> > > > >
> > > > > This should have been fixed with CVE-2011-1161 long ago, but
> > > > > apparently a correct version of that patch never made it into the
> kernel.
> > > > >
> > > > > Cc: sta...@vger.kernel.org
> > > > > Signed-off-by: Alexander Steffen
> 
> > > > > ---
> > > > > v2:
> > > > > - Moved all changes to tpm_common_write in a single patch.
> > > > >
> > > > >  drivers/char/tpm/tpm-dev-common.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/char/tpm/tpm-dev-common.c
> > > > > b/drivers/char/tpm/tpm-dev-common.c
> > > > > index 610638a..ac25574 100644
> > > > > --- a/drivers/char/tpm/tpm-dev-common.c
> > > > > +++ b/drivers/char/tpm/tpm-dev-common.c
> > > > > @@ -99,7 +99,8 @@ ssize_t tpm_common_write(struct file *file,
> const
> > > char __user *buf,
> > > > >   if (atomic_read(>data_pending) != 0)
> > > > >   return -EBUSY;
> > > > >
> > > > > - if (in_size > TPM_BUFSIZE)
> > > > > + if (in_size > sizeof(priv->data_buffer) || in_size < 6 ||
> > > > > + in_size < be32_to_cpu(*((__be32 *) (buf + 2
> > > > >   return -E2BIG;
> > > > >
> > > > >   mutex_lock(>buffer_mutex);
> > > > > --
> > > > > 2.7.4
> > > > >
> > > >
> > > > How did you test this change after you implemented it?
> > > >
> > > > Just thinking what to add to
> > > > https://github.com/jsakkine-intel/tpm2-scripts
> >
> > I already had test cases that failed with some of my TPMs under some
> circumstances. I'll try to come up with a concise description of what those
> tests do or send you a patch directly for your tests. GitHub pull requests are
> okay for that repository? (I already have one waiting there.)
> >
> > > >
> > > > /Jarkko
> > >
> > > Just when I started to implement this that the bug fix itself does not 
> > > have
> yet
> > > the right semantics.
> > >
> > > It should be just add a new check:
> > >
> > > if (in_size != be32_to_cpu(*((__be32 *) (buf + 2
> > >   return -EINVAL;
> > >
> > > The existing check is correct. This was missing. The reason for this is 
> > > that
> we
> > > process whatever is in the in_size bytes as a full command.
> >
> > There are two problems with this solution:
> >
> > 1. You need to check for in_size < 6, otherwise you read data that has
> > not been written there during that request. I haven't tested this, but
> > I'd expect it to fail for example if you try to send the two commands
> > "00 00 00 00 00 02" and "00 00". The first will be rejected with
> > EINVAL, because 6 (in_size) != 2 (commandSize). But the second will
> > pass that check, because now in_size happens to match the commandSize
> > that has only been written to the buffer for the first command.
> 
> AFAIK tpm_transmit checks that the command has at least the header.

This was only a simple example, it will fail with other values as well. Just 
add 8 to both size fields and append 8 null bytes, and you will pass the length 
check in tpm_transmit but still have incorrect data. Also, it is probably no 
good style to omit checks for obvious errors, just because an unrelated check 
in a completely different location also happens to catch the problem under some 
circumstances. tpm_common_write discards the relevant information (in_size), so 
all other parts of the code need to be able to rely on it to have validated it 
properly.

> > 2. You may not reject commands where in_size > commandSize, because
> > TIS/PTP require the TPM to throw away extra bytes (and process the
> > command as usual), not fail the command. You can see that in the State
> > Transition Table (Table 18 in TIS 1.3), line 20, with the TPM in
> > Reception state and Expect=0, writing more data does not change the
> > state ("Write is not expected. Drop write. TPM ignores this state
> > transition."). Of course, since we do not pass on in_size, but only
> > commandSize the TPM will never see those extra bytes, but the external
> > behavior (for user space applications) is identical.
> 
> OK, this is more relevant. What is the legit case to send extra bytes?

For me: testing that my TPM implementation behaves correctly :) I can run the 
same test cases against the TPM using the Linux driver and 

RE: [PATCH v3] tpm-dev-common: Reject too short writes

2017-09-11 Thread Alexander.Steffen
> On Sat, Sep 09, 2017 at 12:37:39AM +0300, Jarkko Sakkinen wrote:
> > On Fri, Sep 08, 2017 at 05:21:32PM +0200, Alexander Steffen wrote:
> > > tpm_transmit() does not offer an explicit interface to indicate the
> number
> > > of valid bytes in the communication buffer. Instead, it relies on the
> > > commandSize field in the TPM header that is encoded within the buffer.
> > > Therefore, ensure that a) enough data has been written to the buffer, so
> > > that the commandSize field is present and b) the commandSize field does
> not
> > > announce more data than has been written to the buffer.
> > >
> > > This should have been fixed with CVE-2011-1161 long ago, but apparently
> > > a correct version of that patch never made it into the kernel.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Alexander Steffen 
> > > ---
> > > v2:
> > > - Moved all changes to tpm_common_write in a single patch.
> > > v3:
> > > - Access data copied from user space (priv->data_buffer) instead of user
> > >   space data directly (buf).
> > > - Changed return code to EINVAL.
> > >
> > >  drivers/char/tpm/tpm-dev-common.c | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/char/tpm/tpm-dev-common.c
> b/drivers/char/tpm/tpm-dev-common.c
> > > index 610638a..461bf0b 100644
> > > --- a/drivers/char/tpm/tpm-dev-common.c
> > > +++ b/drivers/char/tpm/tpm-dev-common.c
> > > @@ -110,6 +110,12 @@ ssize_t tpm_common_write(struct file *file,
> const char __user *buf,
> > >   return -EFAULT;
> > >   }
> > >
> > > + if (in_size < 6 ||
> > > + in_size < be32_to_cpu(*((__be32 *) (priv->data_buffer + 2 {
> > > + mutex_unlock(>buffer_mutex);
> > > + return -EINVAL;
> > > + }
> > > +
> > >   /* atomic tpm command send and result receive. We only hold the
> ops
> > >* lock during this period so that the tpm can be unregistered even if
> > >* the char dev is held open.
> > > --
> > > 2.7.4
> > >
> >
> > I'm not gonna fight about that "in_size < 6" check. I think it is not
> > needed, I understand your point but still disagree but it is something
> > where I can live with having it.
> >
> > I kind of disagree also with allowing messages longer than the command
> > size but it does not have to be in the scope of this commit and actually
> > should be a separate discussion if we ever going to do something about
> > it.
> >
> > Thanks for the patience!
> >
> > Reviewed-by: Jarkko Sakkinen 
> >
> > /Jarkko
> 
> Without your fix:
> 
> $ python -m unittest -v tpm2_smoke.SmokeTest.test_too_short_cmd
> test_too_short_cmd (tpm2_smoke.SmokeTest) ... FAIL
> 
> ==
> 
> FAIL: test_too_short_cmd (tpm2_smoke.SmokeTest)
> --
> Traceback (most recent call last):
>   File "tpm2_smoke.py", line 157, in test_too_short_cmd
> self.assertEqual(rejected, True)
> AssertionError: False != True
> 
> --
> Ran 1 test in 2.108s
> 
> FAILED (failures=1)
> 
> The test case expects to get a posix error, which it doesn't get.
> 
> With your fix:
> 
> $ python -m unittest -v tpm2_smoke.SmokeTest.test_too_short_cmd
> test_too_short_cmd (tpm2_smoke.SmokeTest) ... ok
> 
> --
> Ran 1 test in 2.099s
> 
> OK
> 
> So looks good to me.
> 
> Tested-by: Jarkko Sakkinen 
> 
> Can you test the master branch with SPI TPM? I had to tinker your
> commits a bit because of merge conflicts with Arnd's commit. I'll
> put everything back to next as soon as I hear from you. Thanks
> 
> /Jarkko

tpm_tis_spi in master (3897f7c) is broken, it does not transfer any data. I'll 
send you a fixed patch for the DMA change.

Alexander


RE: [tpmdd-devel] [PATCH v2 0/4] additional TPM performance improvements

2017-09-11 Thread Alexander.Steffen
> After further discussions with the Device Driver working group (ddwg),
> the following changes were made:
> 
> * Check for burstcount at least once to confirm the TPM is ready to accept
> the data. Similarly, query for the TPM Expect status as sanity check at
> the end.
> 
> * Make the sleep for status check during send() in the loop less than
> 5msec.
> 
> * Make the sleep in the loop while querying for burstcount less than
> 5msec.
> 
> Below is the list of patches along with the performance improvements
> seen with a TPM 1.2 with an 8 byte burstcount for 1000 extends:
> 
> Patch|Improvement(time in sec)
> 
> tpm: ignore burstcount to improve tpm_tis| ~41 - ~14
> send() performance.
> 
> tpm: define __wait_for_tpm_stat to specify   | ~14 - ~10
> variable polling sleep time
> 
> tpm: reduce tpm_msleep() time in | ~10 - ~9
> get_burstcount()
> 
> tpm: modify tpm_msleep() function to have| ~9 - ~8
> max range
> 
> Changelog v2:
> 
> * Add module parameter to handle ignoring of burst count during
> tpm tis send() operation.
> * Add improvements over sleep time to reduce delays.
> 
> Nayna Jain (4):
>   tpm: ignore burstcount to improve tpm_tis send() performance.
>   tpm: define __wait_for_tpm_stat to specify variable polling sleep time
>   tpm: reduce tpm_msleep() time in get_burstcount()
>   tpm: use tpm_msleep() value as max delay
> 
>  Documentation/admin-guide/kernel-parameters.txt |  8 ++
>  drivers/char/tpm/tpm-interface.c| 15 --
>  drivers/char/tpm/tpm.h  |  7 +++--
>  drivers/char/tpm/tpm_tis_core.c | 37 
> +++--
>  4 files changed, 53 insertions(+), 14 deletions(-)
> 
> --
> 2.13.3

I haven't looked at the changes in detail yet, but some initial comments:

The ignore_burst_count functionality has already received some negative 
feedback and probably needs more iterations, if it can be accepted at all, so 
you might want to split it off from the rest of the series.

Also, did you explore any alternative ways to activate that functionality 
besides a command line parameter? If it is off by default, then most users will 
never see the benefits, and if they do activate it manually, then they might 
hit some obscure bugs because those code paths receive only little testing. I 
could imagine for example activating this functionality automatically for 
certain safe drivers like tpm_tis_spi, where wait states cannot block us (or 
anyone else on the bus) indefinitely.

I ran all your patches through my tests (though without making any changes, so 
ignore_burst_count was off, since that is the default) and did not see any 
failures. My test bench does not contain performance tests (yet), so I cannot 
confirm your measurements. checkpatch.pl has a couple of complaints though.

Alexander


RE: [PATCH v2] tpm-dev-common: Reject too short writes

2017-09-06 Thread Alexander.Steffen
> On Wed, Sep 06, 2017 at 03:42:33PM +0300, Jarkko Sakkinen wrote:
> > On Mon, Sep 04, 2017 at 07:36:42PM +0200, Alexander Steffen wrote:
> > > tpm_transmit() does not offer an explicit interface to indicate the
> > > number of valid bytes in the communication buffer. Instead, it
> > > relies on the commandSize field in the TPM header that is encoded within
> the buffer.
> > > Therefore, ensure that a) enough data has been written to the
> > > buffer, so that the commandSize field is present and b) the
> > > commandSize field does not announce more data than has been written
> to the buffer.
> > >
> > > This should have been fixed with CVE-2011-1161 long ago, but
> > > apparently a correct version of that patch never made it into the kernel.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Alexander Steffen 
> > > ---
> > > v2:
> > > - Moved all changes to tpm_common_write in a single patch.
> > >
> > >  drivers/char/tpm/tpm-dev-common.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/char/tpm/tpm-dev-common.c
> > > b/drivers/char/tpm/tpm-dev-common.c
> > > index 610638a..ac25574 100644
> > > --- a/drivers/char/tpm/tpm-dev-common.c
> > > +++ b/drivers/char/tpm/tpm-dev-common.c
> > > @@ -99,7 +99,8 @@ ssize_t tpm_common_write(struct file *file, const
> char __user *buf,
> > >   if (atomic_read(>data_pending) != 0)
> > >   return -EBUSY;
> > >
> > > - if (in_size > TPM_BUFSIZE)
> > > + if (in_size > sizeof(priv->data_buffer) || in_size < 6 ||
> > > + in_size < be32_to_cpu(*((__be32 *) (buf + 2
> > >   return -E2BIG;
> > >
> > >   mutex_lock(>buffer_mutex);
> > > --
> > > 2.7.4
> > >
> >
> > How did you test this change after you implemented it?
> >
> > Just thinking what to add to
> > https://github.com/jsakkine-intel/tpm2-scripts

I already had test cases that failed with some of my TPMs under some 
circumstances. I'll try to come up with a concise description of what those 
tests do or send you a patch directly for your tests. GitHub pull requests are 
okay for that repository? (I already have one waiting there.)

> >
> > /Jarkko
> 
> Just when I started to implement this that the bug fix itself does not have 
> yet
> the right semantics.
> 
> It should be just add a new check:
> 
> if (in_size != be32_to_cpu(*((__be32 *) (buf + 2
>   return -EINVAL;
> 
> The existing check is correct. This was missing. The reason for this is that 
> we
> process whatever is in the in_size bytes as a full command.

There are two problems with this solution:

1. You need to check for in_size < 6, otherwise you read data that has not been 
written there during that request. I haven't tested this, but I'd expect it to 
fail for example if you try to send the two commands "00 00 00 00 00 02" and 
"00 00". The first will be rejected with EINVAL, because 6 (in_size) != 2 
(commandSize). But the second will pass that check, because now in_size happens 
to match the commandSize that has only been written to the buffer for the first 
command.

2. You may not reject commands where in_size > commandSize, because TIS/PTP 
require the TPM to throw away extra bytes (and process the command as usual), 
not fail the command. You can see that in the State Transition Table (Table 18 
in TIS 1.3), line 20, with the TPM in Reception state and Expect=0, writing 
more data does not change the state ("Write is not expected. Drop write. TPM 
ignores this state transition."). Of course, since we do not pass on in_size, 
but only commandSize the TPM will never see those extra bytes, but the external 
behavior (for user space applications) is identical.

When you fix those two problems, you're probably back at my solution. The only 
other change I made (renaming TPM_BUFSIZE to sizeof(priv->data_buffer)) does 
not change anything, the values are identical. I just find it very confusing 
when you compare something against a magic constant to avoid buffer overflows 
in your memcpy. Using the buffer size directly is much more straightforward 
(and less prone to fail as soon as someone changes the structure definition).

> 
> Sorry I didn't notice before I started to implement a test case.
> 
> /Jarkko

Alexander


RE: char-TPM: Adjustments for ten function implementations

2017-10-18 Thread Alexander.Steffen
> On Wed, 2017-10-18 at 11:00 +0200, SF Markus Elfring wrote:
> > > The printk removals do change the objects.
> > >
> > > The value of that type of change is only for resource limited systems.
> >
> > I imagine that such small code adjustments are also useful for other
> systems.
> 
> Your imagination and mine differ.
> Where do you _think_ it matters?
> 
> For instance, nothing about
> 
>   sizeof(type)
> vs
>   sizeof(*ptr)
> 
> makes it easier for a human to read the code.

If it does not make it easier to read the code for you, then maybe you should 
consider that this might not be true for all humans. For me, it makes it much 
easier to see at a glance, that code like ptr=malloc(sizeof(*ptr)) is correct.

Alexander


RE: char-TPM: Adjustments for ten function implementations

2017-10-18 Thread Alexander.Steffen
> On Wed, 2017-10-18 at 10:44 +, alexander.stef...@infineon.com wrote:
> > > For instance, nothing about
> > > > >   sizeof(type)
> > > > > vs
> > > > >   sizeof(*ptr)
> > > > > makes it easier for a human to read the code.
> > > >
> > > > If it does not make it easier to read the code for you, then maybe you
> > > > should consider that this might not be true for all humans. For me, it
> > > > makes it much easier to see at a glance, that code like
> > > > ptr=malloc(sizeof(*ptr)) is correct.
> > >
> > > I don't think there is a perfect solution.
> >
> > Maybe. But for the second variant the correctness is easier to check,
> 
> How often should
>   ptr = alloc(sizeof(*ptr))
> be
>   ptr = alloc(sizeof(**ptr))

Never? Because in that case it probably should be *ptr=alloc(sizeof(**ptr)), 
unless you are doing something horrible ;-)

Alexander


RE: char-TPM: Adjustments for ten function implementations

2017-10-18 Thread Alexander.Steffen
> On Wed, 18 Oct 2017, alexander.stef...@infineon.com wrote:
> 
> > > On Wed, 2017-10-18 at 11:00 +0200, SF Markus Elfring wrote:
> > > > > The printk removals do change the objects.
> > > > >
> > > > > The value of that type of change is only for resource limited systems.
> > > >
> > > > I imagine that such small code adjustments are also useful for other
> > > systems.
> > >
> > > Your imagination and mine differ.
> > > Where do you _think_ it matters?
> > >
> > > For instance, nothing about
> > >
> > >   sizeof(type)
> > > vs
> > >   sizeof(*ptr)
> > >
> > > makes it easier for a human to read the code.
> >
> > If it does not make it easier to read the code for you, then maybe you
> > should consider that this might not be true for all humans. For me, it
> > makes it much easier to see at a glance, that code like
> > ptr=malloc(sizeof(*ptr)) is correct.
> 
> I don't think there is a perfect solution.

Maybe. But for the second variant the correctness is easier to check, both 
mentally and programmatically, because there is no need for any context (the 
type of ptr does not matter).

> The type argument to sizeof
> could have the wrong type.  The expression argument to sizeof could be
> missing the *.  Unpleasant consequences are possible in both cases.
> Probably each maintainer has a style they prefer.  Perhaps it could be
> useful to adjust the code to follow the dominant strategy, in cases where
> there are a inconsistencies.

Certainly. At least within a file, there should be only one style.

> For example
> 
> if (...)
>   x = foo1(sizeof(struct xtype));
> else
>   x = foo2(sizeof(*x));
> 
> might at least cause some unnecessary mental effort to process.
> 
> julia

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-18 Thread Alexander.Steffen
> On Tue, 17 Oct 2017, Mimi Zohar wrote:
> 
> > On Tue, 2017-10-17 at 14:58 +0200, Julia Lawall wrote:
> > >
> > > On Tue, 17 Oct 2017, Mimi Zohar wrote:
> > >
> > > > On Tue, 2017-10-17 at 11:50 +, alexander.stef...@infineon.com
> > > > wrote:
> > > > > > > Replace the specification of data structures by pointer
> dereferences
> > > > > > > as the parameter for the operator "sizeof" to make the
> corresponding
> > > > > > > size
> > > > > > > determination a bit safer according to the Linux coding style
> > > > > > > convention.
> > > > > >
> > > > > >
> > > > > > This patch does one style in favor of the other.
> > > > >
> > > > > I actually prefer that style, so I'd welcome this change :)
> > > >
> > > > Style changes should be reviewed and documented, like any other
> code
> > > > change, and added to Documentation/process/coding-style.rst or an
> > > > equivalent file.
> > >
> > > Actually, it has been there for many years:
> > >
> > > 14) Allocating memory
> > > -
> > > ...
> > > The preferred form for passing a size of a struct is the following:
> > >
> > > .. code-block:: c
> > >
> > >   p = kmalloc(sizeof(*p), ...);
> > >
> > > The alternative form where struct name is spelled out hurts readability
> and
> > > introduces an opportunity for a bug when the pointer variable type is
> changed
> > > but the corresponding sizeof that is passed to a memory allocator is not.
> >
> > True, thanks for the reminder.  Is this common in new code?  Is there
> > a script/ or some other automated way of catching this usage before
> > patches are upstreamed?
> >
> > Just as you're doing here, the patch description should reference this
> > in the patch description.
> 
> The comment in the documentation seems have been there since Linux
> 2.6.14,
> ie 2005.  The fact that a lot of code still doesn't use that style, 12
> years later, suggests that actually it is not preferred, or not preferred
> by everyone.  Perhaps the paragraph in coding style should just be
> dropped.

Or maybe everyone just copied existing code, which did not follow that pattern, 
because nobody bothered to fix old code ;-)

(This is true at least for tpm_tis_spi, where I was involved in its creation.)

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-20 Thread Alexander.Steffen
> On Fri, Oct 20, 2017 at 12:01:39PM +0300, Jarkko Sakkinen wrote:
> > On Thu, Oct 19, 2017 at 04:58:23PM +,
> alexander.stef...@infineon.com wrote:
> > > > On Tue, Oct 17, 2017 at 11:50:05AM +,
> alexander.stef...@infineon.com
> > > > wrote:
> > > > > > > Replace the specification of data structures by pointer
> dereferences
> > > > > > > as the parameter for the operator "sizeof" to make the
> corresponding
> > > > > > > size
> > > > > > > determination a bit safer according to the Linux coding style
> > > > > > > convention.
> > > > > >
> > > > > >
> > > > > > This patch does one style in favor of the other.
> > > > >
> > > > > I actually prefer that style, so I'd welcome this change :)
> > > > >
> > > > > > At the end it's Jarkko's call, though I would NAK this as I think 
> > > > > > some
> > > > > > one already told this to you for some other similar patch(es).
> > > > > >
> > > > > >
> > > > > > I even would suggest to stop doing this noisy stuff, which keeps
> people
> > > > > > busy for nothing.
> > > > >
> > > > > Cleaning up old code is also worth something, even if does not change
> > > > > one bit in the assembly output in the end...
> > > > >
> > > > > Alexander
> > > >
> > > > Quite insignificant clean up it is that does more harm that gives any
> > > > benefit as any new change adds debt to backporting.
> > > >
> > > > Anyway, this has been a useful patch set for me in the sense that I have
> > > > clearer picture now on discarding/accepting commits.
> > >
> > > Indeed. I have now a better understanding for why some code looks as
> > > ugly as it does.
> > >
> > > > One line minor
> > > > clean up will be from now on automatic NAK unless it causes a compiler
> > > > warning or some other visible side-effect.
> > >
> > > Not a nice policy, but at least a policy. I have deleted the tasks
> > > that I had still planned for other cleanup activities.
> > >
> > > Alexander
> >
> > 1/4 and 2/4 are sensible clean ups as long as the commit message is
> > refined.
> >
> > Moving more functions to use tpm_buf instead of nasty tpm_cmd_t are
> > also welcome changes.
> >
> > Documenting functions (exported mainly) is also welcome. Or refining
> > documentation.
> >
> > It's really case by case. The important thing in small clean ups is
> > a clearly written commit message that explains rationale.
> >
> > /Jarkko
> 
> It's easy to say in retroperpective that code is "ugly". I would use
> strong consideration before using that adjective for mainline code.
> Rarely when you do something new first the form will be polished.

(Let's not argue over words, not being a native speaker, I'll probably not 
choose the perfect expression all the time.)

My assumption is that in many cases the code was not like that from the 
beginning. Over time, new functionality got added, interfaces expanded, etc. 
But when the goal tends to be to keep the changes minimal, then it is only 
natural for everyone to be focused on their small parts of the code, and not 
clean up the surrounding areas (or better integrate with them).

> I was too steep with the policy above. It's not exactly like that in the
> strict sense. It's always case by case like for any commit. However, it
> is good to remember that "ugliness" does not cause regressions.

Not by itself, no. But it causes cognitive strain while working with the code, 
and that might lead to bugs.

Alexander


RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-21 Thread Alexander.Steffen
> > > On 10/20/2017 08:12 PM, alexander.stef...@infineon.com wrote:
> > > >> The TPM burstcount status indicates the number of bytes that can
> > > >> be sent to the TPM without causing bus wait states.  Effectively,
> > > >> it is the number of empty bytes in the command FIFO.
> > > >>
> > > >> This patch optimizes the tpm_tis_send_data() function by checking
> > > >> the burstcount only once. And if the burstcount is valid, it writes
> > > >> all the bytes at once, permitting wait state.
> > > >>
> > > >> After this change, performance on a TPM 1.2 with an 8 byte
> > > >> burstcount for 1000 extends improved from ~41sec to ~14sec.
> > > >>
> > > >> Suggested-by: Ken Goldman  in
> > > >> conjunction with the TPM Device Driver work group.
> > > >> Signed-off-by: Nayna Jain
> > > >> Acked-by: Mimi Zohar
> > > >> ---
> > > >>   drivers/char/tpm/tpm_tis_core.c | 42 +++--
> --
> > --
> > > 
> > > >> 
> > > >>   1 file changed, 15 insertions(+), 27 deletions(-)
> > > >>
> > > >> diff --git a/drivers/char/tpm/tpm_tis_core.c
> > > >> b/drivers/char/tpm/tpm_tis_core.c
> > > >> index b33126a35694..993328ae988c 100644
> > > >> --- a/drivers/char/tpm/tpm_tis_core.c
> > > >> +++ b/drivers/char/tpm/tpm_tis_core.c
> > > >> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip
> > > *chip,
> > > >> u8 *buf, size_t len)
> > > >>   {
> > > >>struct tpm_tis_data *priv = dev_get_drvdata(>dev);
> > > >>int rc, status, burstcnt;
> > > >> -  size_t count = 0;
> > > >>bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> > > >>
> > > >>status = tpm_tis_status(chip);
> > > >> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct
> tpm_chip
> > > *chip,
> > > >> u8 *buf, size_t len)
> > > >>}
> > > >>}
> > > >>
> > > >> -  while (count < len - 1) {
> > > >> -  burstcnt = get_burstcount(chip);
> > > >> -  if (burstcnt < 0) {
> > > >> -  dev_err(>dev, "Unable to read
> burstcount\n");
> > > >> -  rc = burstcnt;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> -  burstcnt = min_t(int, burstcnt, len - count - 1);
> > > >> -  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> > > >>> locality),
> > > >> -   burstcnt, buf + count);
> > > >> -  if (rc < 0)
> > > >> -  goto out_err;
> > > >> -
> > > >> -  count += burstcnt;
> > > >> -
> > > >> -  if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> > > >>> timeout_c,
> > > >> -  >int_queue, false) < 0) {
> > > >> -  rc = -ETIME;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> -  status = tpm_tis_status(chip);
> > > >> -  if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0)
> {
> > > >> -  rc = -EIO;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> +  /*
> > > >> +   * Get the initial burstcount to ensure TPM is ready to
> > > >> +   * accept data.
> > > >> +   */
> > > >> +  burstcnt = get_burstcount(chip);
> > > >> +  if (burstcnt < 0) {
> > > >> +  dev_err(>dev, "Unable to read burstcount\n");
> > > >> +  rc = burstcnt;
> > > >> +  goto out_err;
> > > >>}
> > > >>
> > > >> +  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> >locality),
> > > >> +  len - 1, buf);
> > > >> +  if (rc < 0)
> > > >> +  goto out_err;
> > > >> +
> > > >>/* write last byte */
> > > >> -  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> > > >> buf[count]);
> > > >> +  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> buf[len-
> > > >> 1]);
> > > >>if (rc < 0)
> > > >>goto out_err;
> > > >>
> > > >> --
> > > >> 2.13.3
> > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying 
> > > > to
> > > send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> > > seem to take an unusual amount of time. More analysis probably has to
> > wait
> > > until November, since I am going to be in Prague next week.
> > >
> > > Thanks Alex for testing these.. Did you get the chance to do any further
> > > analysis ?
> >
> > I am working on that now. Ken's suggestion seems reasonable, so I am
> going
> > to test whether correctly waiting for the flags to change fixes the problem.
> If
> > it does, I'll send the patches.
> 
> Sorry for the delay, I had to take care of some device tree changes in v4.14
> that broke my ARM test machines.
> 
> I've implemented some patches that fix the issue that Ken pointed out and
> rebased your patch 2/4 ("ignore burstcount") on top. While doing 

RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-23 Thread Alexander.Steffen
> On Wed, Nov 22, 2017 at 06:52:03AM +,
> alexander.stef...@infineon.com wrote:
> > > > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when 
> > > > > > trying
> to
> > > > > send large amounts of data, e.g. with TPM2_Hash, and subsequent
> tests
> > > > > seem to take an unusual amount of time. More analysis probably has
> to
> > > > wait
> > > > > until November, since I am going to be in Prague next week.
> > > > >
> > > > > Thanks Alex for testing these.. Did you get the chance to do any
> further
> > > > > analysis ?
> > > >
> > > > I am working on that now. Ken's suggestion seems reasonable, so I am
> > > going
> > > > to test whether correctly waiting for the flags to change fixes the
> problem.
> > > If
> > > > it does, I'll send the patches.
> > >
> > > Sorry for the delay, I had to take care of some device tree changes in
> v4.14
> > > that broke my ARM test machines.
> > >
> > > I've implemented some patches that fix the issue that Ken pointed out
> and
> > > rebased your patch 2/4 ("ignore burstcount") on top. While doing this I
> > > noticed that your original patch does not, as the commit message says,
> write
> > > all the bytes at once, but still unnecessarily splits all commands into at
> least
> > > two transfers (as did the original code). I've fixed this as well in my
> patches,
> > > so that all bytes are indeed sent in a single call, without special 
> > > handling
> for
> > > the last byte. This should speed up things further, especially for small
> > > commands and drivers like tpm_tis_spi, where writing a single byte
> > > translates into additional SPI transfers.
> 
> Thanks Alex, for digging into.
> 
> Yeah, you are right, the first version of this patch sent all the bytes 
> together,
> but after hearing ddwg inputs,
> i.e. "The last byte was introduced for error checking purposes (history).", I
> reverted back to original to be safe.
> 
> It seems that the last byte was sent from the beginning (27084ef [PATCH]
> tpm: driver for next generation TPM chips,),
> does anyone remember the reason ?

The intention seems to be to make extra sure that the TPM has correctly 
understood the command by observing the Expect flag flipping from 1 to 0 when 
writing the last byte.

But following Ken's arguments, this does not work as intended, because the 
Expect flag will change not when writing the last byte to the FIFO, but when 
the TPM reads the last byte from the FIFO. Since there is no "FIFO empty" 
indication, just observing the Expect flag to be 1 before writing the last 
byte, cannot reliably tell us anything (there might be enough data left in the 
FIFO for the Expect flag to flip to 0 without writing the last byte).

Also, I'd argue that this check is not necessary, because if the Expect flag is 
0 after all bytes have been written to the FIFO, then the TPM has correctly 
received the command and is ready to execute it. According to TIS/PTP the TPM 
is required to throw away all extra bytes that were not announced in the 
header, and in addition the kernel driver already ensures not to send more 
data. That are enough safeguards, I'd say.

> 
> > >
> > > Unfortunately, even with those changes the problem persists. But I've
> got
> > > more detailed logs now and will try to understand and hopefully fix the
> issue.
> > > I'll follow up with more details and/or patches once I know more.
> >
> > Okay, so the problem seems to be that at some point the TPM starts
> inserting wait states for the FIFO access. The driver tries to handle this, 
> but
> fails since even the 50 retries that are currently used do not seem to be
> enough. Adding small (millisecond) delays between the attempts did not
> help so far.
> >
> > Is there any limit in the specification for how many wait states the TPM may
> generate or for how long it may do so? I could not find anything, but we need
> to use something there to prevent a faulty TPM from blocking the kernel
> forever.
> >
> 
> I have been thinking on this, so was wondering:
> 
> 1. As you said the problem started while sending large amounts of data for
> TPM2_Hash, how large is "large" ? I mean did it work for some specific large
> values before failing.

Around 1k of data (the exact values are chosen randomly, and it failed many 
times), but I did not try to find a specific boundary. The interesting thing 
was that for this long command all SPI frames with the maximum payload of 64 
bytes were accepted without wait states, but the last frame (with less than 64 
bytes) caused the wait states.

> 2. Are these wait states limited to SPI, or does it happen on LPC as well?

I do not know for LPC because there the wait states are handled in hardware and 
I cannot trace the LPC signals.

> Thanks & Regards,
>- Nayna
> 
> 
> > Alexander
> >
> 


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-14 Thread Alexander.Steffen
> [Mario from Dell added to CC list.]
> 
> Dear Alexander,
> 
> 
> On 12/11/17 17:08, alexander.stef...@infineon.com wrote:
> 
> >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> >>>
>  I have no access to the system right now, but want to point out, that
> the
>  log was created by `journactl -k`, so I do not know if that messes with
> the
>  time stamps. I checked the output of `dmesg` but didn’t see the TPM
> error
>  messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-
> id 0xFE,
>  rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> >>>
> >>> It is a good question, I don't know.. If your kernel isn't setup to
> >>> timestamp messages then the journalstamp will certainly be garbage.
> >>>
> >>> No idea why you wouldn't see the messages in dmesg, if they are not in
> >>> dmesg they couldn't get into the journal
> >>
> >> It looks like I was running an older Linux kernel version, when running
> >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> >> kernel time stamps, showing that the delays work correctly.
> >>
> >> ```
> >> $ uname -a
> >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> >> $ sudo dmesg | grep TPM
> >> [0.00] ACPI: TPM2 0x6F332168 34 (v03Tpm2Tabl
> >> 0001 AMI  )
> >> [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> >> [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [3.734808] tpm tpm0: TPM self test failed
> >> [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> >> ```
> >
> > Thanks for the fixed log. So your TPM seems to be rather slow with
> executing the selftests. Could try to apply the patch that I've just sent 
> you? It
> ensures that your TPM gets more time to execute all the tests, up to the limit
> set in the PTP.
> 
> Thank you for your patch. Judging from the time stamps, it seems it
> works, but the TPM still fails.
> 
> ```
> $ dmesg | grep tpm
> [1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> [1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> [2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> [3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> [6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> [6.303242] tpm tpm0: TPM self test failed
> ```
> 
> To be clear, this issue is not reproducible during every start. (But
> that was the same before.)

Thanks for testing. Now you are in the unlucky situation that your TPM was 
probably always broken, but old kernels did not detect that and used it anyway.

To add some more details to what the problem is: The PTP limits the maximum 
runtime of the TPM2_SelfTest command that we try to execute here to 2000ms (see 
https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision_1.3v22.pdf
 table 15, page 65 in the PDF, page 57 according to the printed page numbers). 
Technically, we have no evidence that your TPM is in violation of that 
specification, because it does reply to the command within 2000ms, it just has 
not completed the selftests within that timeframe. But clearly the intention of 
the specification authors was to have the selftests completed within that 
limit, there is no sense in allowing 2s just for the TPM to generate an answer 
without actually making any progress.

The TPM2_SelfTest command is special in that it is allowed to either execute 
all selftests and then return TPM_RC_SUCCESS or just schedule the selftest 
execution in the background and return TPM_RC_TESTING immediately (see 
https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-3-Commands-01.38.pdf
 chapter 10.2.1, page 43/29). Your TPM apparently chooses the second option, 
but (sometimes?) fails to complete the selftests within the limit that we set 
(which is far longer than the 2s from the PTP).

I'm not sure what to do about 

RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-14 Thread Alexander.Steffen
> > -Original Message-
> > From: alexander.stef...@infineon.com
> [mailto:alexander.stef...@infineon.com]
> > Sent: Thursday, December 14, 2017 6:21 AM
> > To: pmen...@molgen.mpg.de; j...@ziepe.ca
> > Cc: linux-integr...@vger.kernel.org; linux-kernel@vger.kernel.org;
> Limonciello,
> > Mario 
> > Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error
> (2314)
> > occurred continue selftest`
> >
> > > [Mario from Dell added to CC list.]
> > >
> > > Dear Alexander,
> > >
> > >
> > > On 12/11/17 17:08, alexander.stef...@infineon.com wrote:
> > >
> > > >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > > >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > > >>>
> > >  I have no access to the system right now, but want to point out,
> that
> > > the
> > >  log was created by `journactl -k`, so I do not know if that messes
> with
> > > the
> > >  time stamps. I checked the output of `dmesg` but didn’t see the
> TPM
> > > error
> > >  messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM
> (device-
> > > id 0xFE,
> > >  rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > > >>>
> > > >>> It is a good question, I don't know.. If your kernel isn't setup to
> > > >>> timestamp messages then the journalstamp will certainly be
> garbage.
> > > >>>
> > > >>> No idea why you wouldn't see the messages in dmesg, if they are
> not in
> > > >>> dmesg they couldn't get into the journal
> > > >>
> > > >> It looks like I was running an older Linux kernel version, when running
> > > >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > > >> kernel time stamps, showing that the delays work correctly.
> > > >>
> > > >> ```
> > > >> $ uname -a
> > > >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > > >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > > >> $ sudo dmesg | grep TPM
> > > >> [0.00] ACPI: TPM2 0x6F332168 34 (v03
> Tpm2Tabl
> > > >> 0001 AMI  )
> > > >> [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > >> [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [3.734808] tpm tpm0: TPM self test failed
> > > >> [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > > >> ```
> > > >
> > > > Thanks for the fixed log. So your TPM seems to be rather slow with
> > > executing the selftests. Could try to apply the patch that I've just sent
> you? It
> > > ensures that your TPM gets more time to execute all the tests, up to the
> limit
> > > set in the PTP.
> > >
> > > Thank you for your patch. Judging from the time stamps, it seems it
> > > works, but the TPM still fails.
> > >
> > > ```
> > > $ dmesg | grep tpm
> > > [1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > [1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [6.303242] tpm tpm0: TPM self test failed
> > > ```
> > >
> > > To be clear, this issue is not reproducible during every start. (But
> > > that was the same before.)
> >
> > Thanks for testing. Now you are in the unlucky situation that your TPM was
> > probably always broken, but old kernels did not detect that and used it
> anyway.
> >
> Something that Paul can consider is to upgrade the TPM firmware if it's not
> already
> upgraded.  Since the launch of XPS 9360 there was at least one TPM firmware
> update
> issued.  It has been posted to LVFS and can be upgraded using
> fwupd/fwupdate.
> Note: If your TPM is currently owned you will need to go into BIOS setup to
> clear it
> first before upgrading.

I'm not familiar with the specific TPM in your model, but according to the log 
it is a TPM 2.0, which does not really carry over the owner concept of a TPM 
1.2. Is clearing it still necessary for an upgrade then?

> I 

RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-15 Thread Alexander.Steffen
> [Adding Rafael and Len as they, to my knowledge, also use or have a
> access to a Dell XPS 13 9360. With latest Linux master do you get TPM
> self-test errors, when cold starting the system without the power supply
> plugged in?]
> 
> Dear Mario, dear Alexander,
> 
> 
> the added line breaks to the quoted parts really mess up the citation.
> Can we please try to use MUAs avoiding that, or fixing that manually?

Sorry, I'm not sure whether my company has a way for me to avoid using Outlook 
;-) But if there are any configuration changes to make it behave better, I will 
gladly apply them. Do you know of any documentation on this? All I found so far 
either is already applied or was outdated.

I'll remove some of the less relevant quoted parts, so that this is less of an 
issue.

> > To be clear, this issue is not reproducible during every start. (But
> > that was the same before.)
> 
> I think I found out how to reproduce the issue. Cold start the system
> without the power supply connected.
> 
>  Thanks for testing. Now you are in the unlucky situation that your TPM
> was
>  probably always broken, but old kernels did not detect that and used it
> anyway.
> 
> Just to clarify, I do not know if the TPM could ever be used. I believe
> the module loaded but the user space tools (tpm2_version or so) always
> returned an error in my tests.

Interesting. So maybe it is not a bug in your TPM's firmware, but really a 
single defective TPM? Can you try to figure that out? That is, when using an 
older kernel in the cold start scenario, can you execute any useful commands on 
your TPM successfully?

> >>> Something that Paul can consider is to upgrade the TPM firmware if it's
> not
> >>> already
> >>> upgraded.  Since the launch of XPS 9360 there was at least one TPM
> firmware
> >>> update
> >>> issued.  It has been posted to LVFS and can be upgraded using
> >>> fwupd/fwupdate.
> >>> Note: If your TPM is currently owned you will need to go into BIOS setup
> to
> >>> clear it
> >>> first before upgrading.
> >>
> >> I'm not familiar with the specific TPM in your model, but according to the
> log it is a
> >> TPM 2.0, which does not really carry over the owner concept of a TPM 1.2.
> Is
> >> clearing it still necessary for an upgrade then?
> >
> > Yes it's required for the TPM model/vendor that is used in the XPS model
> that
> > Paul has.  If you try to run the upgrade without clearing it the firmware 
> > will
> > reject the upgrade.
> 
> Mario, thank you for your quick reaction.
> 
> […]
> 
> 1.  Can you reproduce this issue too?
> 2.  How do I find out, what TPM firmware version is installed?

If you get the driver loaded, you can ask the TPM (TPM2_GetCapability for 
TPM_PT_FIRMWARE_VERSION_1 and TPM_PT_FIRMWARE_VERSION_2):

python3 -c 'f=open("/dev/tpm0", "r+b", buffering=0); 
f.write(b"\x80\x01\x00\x00\x00\x16\x00\x00\x01z\x00\x00\x00\x06\x00\x00\x01\x0b\x00\x00\x00\x02");
 print(f.readall())'

> 3.  Updating to the firmware 2.4.2 from December 17th, 2017 didn’t fix
> the issue.

You've got a firmware from the future? ;-)

Alexander


RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-16 Thread Alexander.Steffen
> > On 10/20/2017 08:12 PM, alexander.stef...@infineon.com wrote:
> > >> The TPM burstcount status indicates the number of bytes that can
> > >> be sent to the TPM without causing bus wait states.  Effectively,
> > >> it is the number of empty bytes in the command FIFO.
> > >>
> > >> This patch optimizes the tpm_tis_send_data() function by checking
> > >> the burstcount only once. And if the burstcount is valid, it writes
> > >> all the bytes at once, permitting wait state.
> > >>
> > >> After this change, performance on a TPM 1.2 with an 8 byte
> > >> burstcount for 1000 extends improved from ~41sec to ~14sec.
> > >>
> > >> Suggested-by: Ken Goldman  in
> > >> conjunction with the TPM Device Driver work group.
> > >> Signed-off-by: Nayna Jain
> > >> Acked-by: Mimi Zohar
> > >> ---
> > >>   drivers/char/tpm/tpm_tis_core.c | 42 +++
> --
> > 
> > >> 
> > >>   1 file changed, 15 insertions(+), 27 deletions(-)
> > >>
> > >> diff --git a/drivers/char/tpm/tpm_tis_core.c
> > >> b/drivers/char/tpm/tpm_tis_core.c
> > >> index b33126a35694..993328ae988c 100644
> > >> --- a/drivers/char/tpm/tpm_tis_core.c
> > >> +++ b/drivers/char/tpm/tpm_tis_core.c
> > >> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip
> > *chip,
> > >> u8 *buf, size_t len)
> > >>   {
> > >>  struct tpm_tis_data *priv = dev_get_drvdata(>dev);
> > >>  int rc, status, burstcnt;
> > >> -size_t count = 0;
> > >>  bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> > >>
> > >>  status = tpm_tis_status(chip);
> > >> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip
> > *chip,
> > >> u8 *buf, size_t len)
> > >>  }
> > >>  }
> > >>
> > >> -while (count < len - 1) {
> > >> -burstcnt = get_burstcount(chip);
> > >> -if (burstcnt < 0) {
> > >> -dev_err(>dev, "Unable to read 
> > >> burstcount\n");
> > >> -rc = burstcnt;
> > >> -goto out_err;
> > >> -}
> > >> -burstcnt = min_t(int, burstcnt, len - count - 1);
> > >> -rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> > >>> locality),
> > >> - burstcnt, buf + count);
> > >> -if (rc < 0)
> > >> -goto out_err;
> > >> -
> > >> -count += burstcnt;
> > >> -
> > >> -if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> > >>> timeout_c,
> > >> ->int_queue, false) < 0) {
> > >> -rc = -ETIME;
> > >> -goto out_err;
> > >> -}
> > >> -status = tpm_tis_status(chip);
> > >> -if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
> > >> -rc = -EIO;
> > >> -goto out_err;
> > >> -}
> > >> +/*
> > >> + * Get the initial burstcount to ensure TPM is ready to
> > >> + * accept data.
> > >> + */
> > >> +burstcnt = get_burstcount(chip);
> > >> +if (burstcnt < 0) {
> > >> +dev_err(>dev, "Unable to read burstcount\n");
> > >> +rc = burstcnt;
> > >> +goto out_err;
> > >>  }
> > >>
> > >> +rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> > >> +len - 1, buf);
> > >> +if (rc < 0)
> > >> +goto out_err;
> > >> +
> > >>  /* write last byte */
> > >> -rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> > >> buf[count]);
> > >> +rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), 
> > >> buf[len-
> > >> 1]);
> > >>  if (rc < 0)
> > >>  goto out_err;
> > >>
> > >> --
> > >> 2.13.3
> > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to
> > send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> > seem to take an unusual amount of time. More analysis probably has to
> wait
> > until November, since I am going to be in Prague next week.
> >
> > Thanks Alex for testing these.. Did you get the chance to do any further
> > analysis ?
> 
> I am working on that now. Ken's suggestion seems reasonable, so I am going
> to test whether correctly waiting for the flags to change fixes the problem. 
> If
> it does, I'll send the patches.

Sorry for the delay, I had to take care of some device tree changes in v4.14 
that broke my ARM test machines.

I've implemented some patches that fix the issue that Ken pointed out and 
rebased your patch 2/4 ("ignore burstcount") on top. While doing this I noticed 
that your original patch does not, as the commit message says, write all the 
bytes at once, but still unnecessarily splits 

RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-08 Thread Alexander.Steffen
> On 10/20/2017 08:12 PM, alexander.stef...@infineon.com wrote:
> >> The TPM burstcount status indicates the number of bytes that can
> >> be sent to the TPM without causing bus wait states.  Effectively,
> >> it is the number of empty bytes in the command FIFO.
> >>
> >> This patch optimizes the tpm_tis_send_data() function by checking
> >> the burstcount only once. And if the burstcount is valid, it writes
> >> all the bytes at once, permitting wait state.
> >>
> >> After this change, performance on a TPM 1.2 with an 8 byte
> >> burstcount for 1000 extends improved from ~41sec to ~14sec.
> >>
> >> Suggested-by: Ken Goldman  in
> >> conjunction with the TPM Device Driver work group.
> >> Signed-off-by: Nayna Jain
> >> Acked-by: Mimi Zohar
> >> ---
> >>   drivers/char/tpm/tpm_tis_core.c | 42 +++--
> 
> >> 
> >>   1 file changed, 15 insertions(+), 27 deletions(-)
> >>
> >> diff --git a/drivers/char/tpm/tpm_tis_core.c
> >> b/drivers/char/tpm/tpm_tis_core.c
> >> index b33126a35694..993328ae988c 100644
> >> --- a/drivers/char/tpm/tpm_tis_core.c
> >> +++ b/drivers/char/tpm/tpm_tis_core.c
> >> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip
> *chip,
> >> u8 *buf, size_t len)
> >>   {
> >>struct tpm_tis_data *priv = dev_get_drvdata(>dev);
> >>int rc, status, burstcnt;
> >> -  size_t count = 0;
> >>bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> >>
> >>status = tpm_tis_status(chip);
> >> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip
> *chip,
> >> u8 *buf, size_t len)
> >>}
> >>}
> >>
> >> -  while (count < len - 1) {
> >> -  burstcnt = get_burstcount(chip);
> >> -  if (burstcnt < 0) {
> >> -  dev_err(>dev, "Unable to read burstcount\n");
> >> -  rc = burstcnt;
> >> -  goto out_err;
> >> -  }
> >> -  burstcnt = min_t(int, burstcnt, len - count - 1);
> >> -  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> >>> locality),
> >> -   burstcnt, buf + count);
> >> -  if (rc < 0)
> >> -  goto out_err;
> >> -
> >> -  count += burstcnt;
> >> -
> >> -  if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> >>> timeout_c,
> >> -  >int_queue, false) < 0) {
> >> -  rc = -ETIME;
> >> -  goto out_err;
> >> -  }
> >> -  status = tpm_tis_status(chip);
> >> -  if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
> >> -  rc = -EIO;
> >> -  goto out_err;
> >> -  }
> >> +  /*
> >> +   * Get the initial burstcount to ensure TPM is ready to
> >> +   * accept data.
> >> +   */
> >> +  burstcnt = get_burstcount(chip);
> >> +  if (burstcnt < 0) {
> >> +  dev_err(>dev, "Unable to read burstcount\n");
> >> +  rc = burstcnt;
> >> +  goto out_err;
> >>}
> >>
> >> +  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> >> +  len - 1, buf);
> >> +  if (rc < 0)
> >> +  goto out_err;
> >> +
> >>/* write last byte */
> >> -  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> >> buf[count]);
> >> +  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[len-
> >> 1]);
> >>if (rc < 0)
> >>goto out_err;
> >>
> >> --
> >> 2.13.3
> > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to
> send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> seem to take an unusual amount of time. More analysis probably has to wait
> until November, since I am going to be in Prague next week.
> 
> Thanks Alex for testing these.. Did you get the chance to do any further
> analysis ?

I am working on that now. Ken's suggestion seems reasonable, so I am going to 
test whether correctly waiting for the flags to change fixes the problem. If it 
does, I'll send the patches.

Alexander


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-08 Thread Alexander.Steffen
> On Thu, Dec 07, 2017 at 03:56:07PM +, alexander.stef...@infineon.com
> wrote:
> 
> > > If these are intentional, it’d be great
> > > to give some hint to the user, what effect this has.
> >
> > I agree that those error messages in their current form are not that
> > helpful for the users. But they are part of the general driver
> > architecture, and are also caused by other parts of the code
> > (e.g. when using a TPM 1.2 that is deactivated or when the platform
> > did not send a startup command). We should find a way to hide (or at
> > least mark) those kind of expected errors.
> 
> Other parts of the TPM code did supresses 'expected' errors like this,
> I'm not sure if it got removed during Jarkko's cleanup though - we
> need to put stuff like that back, it should not printk for something
> like this.

Yes, I've got this task here somewhere, just no time to do it...

> For this, if we are waiting then it should compute an absolute time
> after which it will give up.
> 
> Code it like this instead and get rid of the ugly 'duration' scheme.
> 
> ktime_t stop = ktime_add_ns(ktime_get(), [timeout in ns]);
> do
> {
> }
> while (ktime_before(ktime_get(), stop);

Is it really that ugly? I still need delay_msec to increase the delay each 
round. I can see the benefit of your suggestion when it is important to get the 
timing exactly right (and also account for time spent elsewhere, when our code 
might not be executing). But in this case having delays that are approximately 
right (or longer than intended) is sufficient.

Anyway, from the log messages it is clear that tpm_msleep got called seven 
times with delays of 20/40/80/160/320/640/1280ms. But still all timestamps lie 
within the same second. How can this be with a cumulated delay of ~2.5s?

Also, I've just noticed that despite the name tpm_msleep calls usleep_range, 
not msleep. Can this have an influence? Should tpm_msleep call msleep for 
longer delays, as suggested by Documentation/timers/timers-howto.txt?

Alexander


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-07 Thread Alexander.Steffen
> Dear Linux folks,
> 
> 
> With Linux 4.15-rc2 built by the Ubuntu Kernel Team, the error messages
> below are shown by the Linux kernel. These are new.
> 
> ```
> Dez 06 13:22:24 Ixpees kernel: microcode: microcode updated early to
> revision 0x62, date = 2017-04-27
> Dez 06 13:22:24 Ixpees kernel: Linux version 4.15.0-041500rc2-generic
> (kernel@gloin) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3)) #201712031230
> SMP Sun Dec 3 17:32:03 UTC 2017
> Dez 06 13:22:24 Ixpees kernel: Command line:
> BOOT_IMAGE=/vmlinuz-4.15.0-041500rc2-generic
> root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
> Dez 06 13:22:24 Ixpees kernel: KERNEL supported cpus:
> Dez 06 13:22:24 Ixpees kernel:   Intel GenuineIntel
> Dez 06 13:22:24 Ixpees kernel:   AMD AuthenticAMD
> Dez 06 13:22:24 Ixpees kernel:   Centaur CentaurHauls
> Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x001:
> 'x87 floating point registers'
> Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x002:
> 'SSE registers'
> […]
> Dez 06 13:22:24 Ixpees kernel: tpm_tis MSFT0101:00: 2.0 TPM (device-id
> 0xFE, rev-id 4)
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: TPM self test failed
> ```
> 
> Any idea what to do about those?

The list of "A TPM error (2314) occurred continue selftest" is caused by my 
commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React correctly to 
RC_TESTING from TPM 2.0 self tests"). 2314 is TPM_RC_TESTING, so the TPM tells 
us that self-tests are still running in the background. This problem was not 
visible in previous versions, since it (incorrectly) ignored TPM_RC_TESTING.

The final error "TPM self test failed" is then the driver giving up after too 
many TPM_RC_TESTING responses have been received.

What confuses me a bit is that according to your log all of that happens within 
the same second. The driver uses tpm2_calc_ordinal_duration for 
TPM2_CC_SELF_TEST which maps to TPM_LONG and finally to TPM2_DURATION_LONG = 
2000ms = 2s. So you should see those retries for at least two seconds. But 
since it does not give the TPM enough time to execute the tests, you see it 
failing in the end.

Could you try to find out how much time exactly (in milliseconds) it takes from 
the first "A TPM error" to the final "TPM self test failed" message? Is it 
possible that tpm_msleep delays for significantly less time in this case?

Also, while looking at the code I've noticed that the retry loop is not written 
in the best possible way and should actually try one more time. Could you make 
the following change to tpm2_do_selftest in drivers/char/tpm/tpm2-cmd.c and see 
whether that helps? 

---
duration = jiffies_to_msecs(
tpm2_calc_ordinal_duration(chip, TPM2_CC_SELF_TEST));
 
-   while (duration > 0) {
+   while (1) {
cmd.header.in = tpm2_selftest_header;
cmd.params.selftest_in.full_test = 0;
 
rc = tpm_transmit_cmd(chip, NULL, ,
TPM2_SELF_TEST_IN_SIZE,
  0, 0, "continue selftest");
 
-   if (rc != TPM2_RC_TESTING)
+   if (rc != TPM2_RC_TESTING || duration <= 0)
break;
 
tpm_msleep(delay_msec);
---

> If these are intentional, it’d be great
> to give some hint to the user, what effect this has.

I agree that those error messages in their current form are not that helpful 
for the users. But they are part of the general driver architecture, and are 
also caused by other parts of the code (e.g. when using a TPM 1.2 that is 
deactivated or when the platform did not send a startup command). We should 
find a way to hide (or at least mark) those kind of expected errors.

Alexander


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-11 Thread Alexander.Steffen
> Dear Jason,
> 
> 
> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> >
> >> I have no access to the system right now, but want to point out, that the
> >> log was created by `journactl -k`, so I do not know if that messes with the
> >> time stamps. I checked the output of `dmesg` but didn’t see the TPM
> error
> >> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id
> 0xFE,
> >> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> >
> > It is a good question, I don't know.. If your kernel isn't setup to
> > timestamp messages then the journalstamp will certainly be garbage.
> >
> > No idea why you wouldn't see the messages in dmesg, if they are not in
> > dmesg they couldn't get into the journal
> 
> It looks like I was running an older Linux kernel version, when running
> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> kernel time stamps, showing that the delays work correctly.
> 
> ```
> $ uname -a
> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> $ sudo dmesg | grep TPM
> [0.00] ACPI: TPM2 0x6F332168 34 (v03Tpm2Tabl
> 0001 AMI  )
> [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> [3.734808] tpm tpm0: TPM self test failed
> [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> ```

Thanks for the fixed log. So your TPM seems to be rather slow with executing 
the selftests. Could try to apply the patch that I've just sent you? It ensures 
that your TPM gets more time to execute all the tests, up to the limit set in 
the PTP.

(I would have sent that patch also to linux-kernel@vger.kernel.org, since it 
was included in the discussion, but for some strange reason my mail system 
declared that to be an "Invalid address"...)

Alexander


RE: [PATCH 0/4] tpm: fix PS/2 devices not working on Braswell systems due CLKRUN enabled

2017-12-20 Thread Alexander.Steffen
> Hi Hans,
> 
> Thanks a lot for your feedback.
> 
> On 12/20/2017 12:43 PM, Hans de Goede wrote:
> > Hi,
> >
> > On 20-12-17 12:35, Javier Martinez Canillas wrote:
> >> Hello,
> >>
> >> Commit 5e572cab92f0 ("tpm: Enable CLKRUN protocol for Braswell
> systems")
> >> added logic in the TPM TIS driver to disable the Low Pin Count CLKRUN
> >> signal during TPM transactions.
> >>
> >> Unfortunately this breaks other devices that are attached to the LPC bus
> >> like for example PS/2 mouse and keyboards.
> >>
> >> The bug was reported to the Fedora kernel [0] and the kernel bugzilla [1].
> >> This issue and the propossed solution were discussed in this [2] thread,
> >> and the reporter (Jame Ettle) confirmed that his system works again after
> >> the fix in this series.
> >>
> >> The patches are based on top or Jarkko Sakkinen's linux-tpmdd [3] tree.
> >>
> >> James,
> >>
> >> Even when there shouldn't be any functional changes, I included some
> other
> >> fixes / cleanups in this series so it would be great if you can test them
> >> again. I can't because I don't have access to a machine affected by this.
> >>
> >> [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1498987
> >> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=197287
> >> [2]: https://patchwork.kernel.org/patch/10119417/
> >> [3]: git.infradead.org/users/jjs/linux-tpmdd.git
> >>
> >> Best regards,
> >> Javier
> >>
> >>
> >> Javier Martinez Canillas (4):
> >>tpm: fix access attempt to an already unmapped I/O memory region
> >>tpm: delete the TPM_TIS_CLK_ENABLE flag
> >>tpm: follow coding style for variable declaration in
> >>  tpm_tis_core_init()
> >>tpm: only attempt to disable the LPC CLKRUN if is already enabled
> >>
> >>   drivers/char/tpm/tpm_tis.c  | 17 +
> >>   drivers/char/tpm/tpm_tis_core.c | 24 +---
> >>   drivers/char/tpm/tpm_tis_core.h |  1 -
> >>   3 files changed, 18 insertions(+), 24 deletions(-)
> >
> > Note I'm just reading along here, but I'm wondering if both the TPM
> > and now also some PS/2 controllers need CLK_RUN to be disabled,
> > why don't we just disable it once permanently and be done with it?
> >
> 
> That's the same question I asked to Azhar who authored the patch that
> added the CLKRUN toggle logic to the driver, but he didn't answer yet:
> 
> https://www.spinics.net/lists/linux-integrity/msg01107.html
> 
> My understanding is that by permanently disabling it, then other devices
> that do support the CLKRUN protocol will continue to work correctly since
> the CLKRUN# signal is optional and only used if the host LCLK is stopped.
> 
> The disadvantage though will be that the power management feature of
> stopping
> the LPC host LCLK clock when entering into low-power states will not be
> used.
> 
> > It seems that on machines with a PS/2 controller connected to
> > the LPC bus the BIOS is already doing this, so I've a feeling that
> > it not being done on devices with a TPM is a bug in the firmware
> 
> Absolutely agree, system integratos should make sure that all the devices
> connected to the LPC either have CLKRUN protocol support and is enabled
> or disable the CLKRUN protocol permanently.

As far as I understand it, this is exactly the issue here: They know that there 
are devices that do not support the CLKRUN protocol (the TPM in this case), but 
they still need to enable it to prevent other issues. So for the TPM to 
continue to work, CLKRUN needs to be disabled temporarily while the TPM is 
active.

> > there and we should just disable it everywhere (and probably
> > find a better place then the TPM driver to do the disabling).
> >
> 
> Indeed. Touching a global bus configuration in a driver for a single device
> isn't safe to say the least. One problem is that we don't have a LPC bus
> subsystem so that's why these sort of quirks are done at the driver level.
> 
> > Note this is just an observation, I could be completely wrong here,
> > but I've a feeling that just disabling CLKRUN all together is the
> > right thing to do and that seems like an easier fix to me.
> >
> 
> I think your observation is correct. The only problem is the power
> management
> feature being disabled as mentioned. Although as you said it seems that
> most
> BIOS do that (as shown by the patch I posted that just avoids toggling the
> CLKRUN state if it's already disabled).
> 
> > Specifically what worries me is: what if another driver also takes
> > the temporarily disable CLK_RUN approach because of similar issues?
> > Now we've 2 drivers playing with the CLKRUN state and racing with
> > each others.
> >
> 
> Agreed. You don't even need another driver, if a Braswell system has 2 TPMs
> then you have a race condition and one driver could enable the CLKRUN
> state
> while the other driver thinks that's already disabled, and TPM transactions
> won't work in that case.

Even though it is an unusual configuration to have multiple TPMs in a system, 
this is something that we 

RE: TPM driver breaks S3 suspend

2017-12-21 Thread Alexander.Steffen
> Hi,
> We have a desktop which has S3 suspend (to RAM) problem due to
> error messages as follows.
> [  198.908282] tpm tpm0: Error (38) sending savestate before suspend
> [  198.908289] __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x160 returns
> 38
> [  198.908293] dpm_run_callback(): pnp_bus_suspend+0x0/0x20 returns 38
> [  198.908298] PM: Device 00:0b failed to suspend: error 38
> 
> However, the first suspend after boot is working although it still
> shows an interesting message during resume.
> [  155.789945] tpm tpm0: A TPM error (38) occurred continue selftest
> 
> The error code 38 in definition is TPM_ERR_INVALID_POSTINIT. I
> found some explanations which said this error code means that this
> command was received in the wrong sequence relative to a TPM_Startup
> command. Don't really know what happens here and how should I deal
> with this? Any suggestions? Please let me know what else information
> should I provide. Thanks
> 
> Chris

Just from looking at the code, this seems to be an issue in tpm_tis_resume.

When the device is not a TPM 2.0, it tries to execute the selftests, but 
ignores the results. In your case the selftests fail during resume, but since 
the error is ignored, the TPM device is still present (though non-functional) 
and so breaks the subsequent suspend.

In addition, from the error code we can tell that it is not actually a selftest 
failure. INVALID_POSTINIT for a command other than TPM_Startup means that no 
TPM_Startup has been executed for that power cycle yet, so the TPM has to 
reject all other commands. Usually, the platform sends the TPM_Startup command, 
but not in your case apparently.

The correct solution should be something like tpm2_auto_startup (execute 
selftests, if they fail because of the missing startup command, execute that 
and retry the selftests). Interestingly, tpm1_auto_startup (same purpose as 
tpm2_auto_startup, but for TPM 1.2 instead) does not use the same sequence, the 
startup-retry part is missing. Is there any reason this is done differently for 
TPM 1.2? Otherwise I'd propose to make tpm1_auto_startup follow the same 
sequence as tpm2_auto_startup and then call both from tpm_tis_resume, similar 
to what tpm_chip_register does.

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-19 Thread Alexander.Steffen
> On Tue, Oct 17, 2017 at 11:50:05AM +, alexander.stef...@infineon.com
> wrote:
> > > > Replace the specification of data structures by pointer dereferences
> > > > as the parameter for the operator "sizeof" to make the corresponding
> > > > size
> > > > determination a bit safer according to the Linux coding style
> > > > convention.
> > >
> > >
> > > This patch does one style in favor of the other.
> >
> > I actually prefer that style, so I'd welcome this change :)
> >
> > > At the end it's Jarkko's call, though I would NAK this as I think some
> > > one already told this to you for some other similar patch(es).
> > >
> > >
> > > I even would suggest to stop doing this noisy stuff, which keeps people
> > > busy for nothing.
> >
> > Cleaning up old code is also worth something, even if does not change
> > one bit in the assembly output in the end...
> >
> > Alexander
> 
> Quite insignificant clean up it is that does more harm that gives any
> benefit as any new change adds debt to backporting.
> 
> Anyway, this has been a useful patch set for me in the sense that I have
> clearer picture now on discarding/accepting commits.

Indeed. I have now a better understanding for why some code looks as ugly as it 
does.

> One line minor
> clean up will be from now on automatic NAK unless it causes a compiler
> warning or some other visible side-effect.

Not a nice policy, but at least a policy. I have deleted the tasks that I had 
still planned for other cleanup activities.

Alexander


RE: [PATCH v4 1/4] tpm: move wait_for_tpm_stat() to respective driver files

2017-10-19 Thread Alexander.Steffen
> On Tue, Oct 17, 2017 at 04:32:29PM -0400, Nayna Jain wrote:
> > The function wait_for_tpm_stat() is currently defined in
> > tpm-interface file. It is a hardware specific function used
> > only by tpm_tis and xen-tpmfront, so it is removed from
> > tpm-interface.c and defined in respective driver files.
> >
> > Suggested-by: Jarkko Sakkinen 
> > Signed-off-by: Nayna Jain 
> > Reviewed-by: Jarkko Sakkinen 
> > ---
> >  drivers/char/tpm/tpm-interface.c | 60 
> > 
> >  drivers/char/tpm/tpm.h   |  2 --
> >  drivers/char/tpm/tpm_tis_core.c  | 60
> 
> >  drivers/char/tpm/xen-tpmfront.c  | 60
> 
> >  4 files changed, 120 insertions(+), 62 deletions(-)
> >
> > diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-
> interface.c
> > index 1d6729be4cd6..313f7618d569 100644
> > --- a/drivers/char/tpm/tpm-interface.c
> > +++ b/drivers/char/tpm/tpm-interface.c
> > @@ -1035,66 +1035,6 @@ int tpm_send(u32 chip_num, void *cmd, size_t
> buflen)
> >  }
> >  EXPORT_SYMBOL_GPL(tpm_send);
> >
> > -static bool wait_for_tpm_stat_cond(struct tpm_chip *chip, u8 mask,
> > -   bool check_cancel, bool *canceled)
> > -{
> > -   u8 status = chip->ops->status(chip);
> > -
> > -   *canceled = false;
> > -   if ((status & mask) == mask)
> > -   return true;
> > -   if (check_cancel && chip->ops->req_canceled(chip, status)) {
> > -   *canceled = true;
> > -   return true;
> > -   }
> > -   return false;
> > -}
> > -
> > -int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask, unsigned long
> timeout,
> > - wait_queue_head_t *queue, bool check_cancel)
> > -{
> > -   unsigned long stop;
> > -   long rc;
> > -   u8 status;
> > -   bool canceled = false;
> > -
> > -   /* check current status */
> > -   status = chip->ops->status(chip);
> > -   if ((status & mask) == mask)
> > -   return 0;
> > -
> > -   stop = jiffies + timeout;
> > -
> > -   if (chip->flags & TPM_CHIP_FLAG_IRQ) {
> > -again:
> > -   timeout = stop - jiffies;
> > -   if ((long)timeout <= 0)
> > -   return -ETIME;
> > -   rc = wait_event_interruptible_timeout(*queue,
> > -   wait_for_tpm_stat_cond(chip, mask, check_cancel,
> > -  ),
> > -   timeout);
> > -   if (rc > 0) {
> > -   if (canceled)
> > -   return -ECANCELED;
> > -   return 0;
> > -   }
> > -   if (rc == -ERESTARTSYS && freezing(current)) {
> > -   clear_thread_flag(TIF_SIGPENDING);
> > -   goto again;
> > -   }
> > -   } else {
> > -   do {
> > -   tpm_msleep(TPM_TIMEOUT);
> > -   status = chip->ops->status(chip);
> > -   if ((status & mask) == mask)
> > -   return 0;
> > -   } while (time_before(jiffies, stop));
> > -   }
> > -   return -ETIME;
> > -}
> > -EXPORT_SYMBOL_GPL(wait_for_tpm_stat);
> > -
> >  #define TPM_ORD_SAVESTATE 152
> >  #define SAVESTATE_RESULT_SIZE 10
> >
> > diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> > index 2d5466a72e40..4fc83ac7abeb 100644
> > --- a/drivers/char/tpm/tpm.h
> > +++ b/drivers/char/tpm/tpm.h
> > @@ -525,8 +525,6 @@ int tpm_do_selftest(struct tpm_chip *chip);
> >  unsigned long tpm_calc_ordinal_duration(struct tpm_chip *chip, u32
> ordinal);
> >  int tpm_pm_suspend(struct device *dev);
> >  int tpm_pm_resume(struct device *dev);
> > -int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask, unsigned long
> timeout,
> > - wait_queue_head_t *queue, bool check_cancel);
> >
> >  static inline void tpm_msleep(unsigned int delay_msec)
> >  {
> > diff --git a/drivers/char/tpm/tpm_tis_core.c
> b/drivers/char/tpm/tpm_tis_core.c
> > index 63bc6c3b949e..b33126a35694 100644
> > --- a/drivers/char/tpm/tpm_tis_core.c
> > +++ b/drivers/char/tpm/tpm_tis_core.c
> > @@ -31,6 +31,66 @@
> >  #include "tpm.h"
> >  #include "tpm_tis_core.h"
> >
> > +static bool wait_for_tpm_stat_cond(struct tpm_chip *chip, u8 mask,
> > +   bool check_cancel, bool *canceled)
> > +{
> > +   u8 status = chip->ops->status(chip);
> > +
> > +   *canceled = false;
> > +   if ((status & mask) == mask)
> > +   return true;
> > +   if (check_cancel && chip->ops->req_canceled(chip, status)) {
> > +   *canceled = true;
> > +   return true;
> > +   }
> > +   return false;
> > +}
> > +
> > +static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> > +   unsigned long timeout, wait_queue_head_t *queue,
> > +   bool check_cancel)
> > +{
> > +   unsigned long stop;
> > +   long rc;
> > +   u8 status;
> > 

RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-10-20 Thread Alexander.Steffen
> The TPM burstcount status indicates the number of bytes that can
> be sent to the TPM without causing bus wait states.  Effectively,
> it is the number of empty bytes in the command FIFO.
> 
> This patch optimizes the tpm_tis_send_data() function by checking
> the burstcount only once. And if the burstcount is valid, it writes
> all the bytes at once, permitting wait state.
> 
> After this change, performance on a TPM 1.2 with an 8 byte
> burstcount for 1000 extends improved from ~41sec to ~14sec.
> 
> Suggested-by: Ken Goldman  in
> conjunction with the TPM Device Driver work group.
> Signed-off-by: Nayna Jain 
> Acked-by: Mimi Zohar 
> ---
>  drivers/char/tpm/tpm_tis_core.c | 42 +++--
> 
>  1 file changed, 15 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm_tis_core.c
> b/drivers/char/tpm/tpm_tis_core.c
> index b33126a35694..993328ae988c 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip *chip,
> u8 *buf, size_t len)
>  {
>   struct tpm_tis_data *priv = dev_get_drvdata(>dev);
>   int rc, status, burstcnt;
> - size_t count = 0;
>   bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> 
>   status = tpm_tis_status(chip);
> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip *chip,
> u8 *buf, size_t len)
>   }
>   }
> 
> - while (count < len - 1) {
> - burstcnt = get_burstcount(chip);
> - if (burstcnt < 0) {
> - dev_err(>dev, "Unable to read burstcount\n");
> - rc = burstcnt;
> - goto out_err;
> - }
> - burstcnt = min_t(int, burstcnt, len - count - 1);
> - rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> >locality),
> -  burstcnt, buf + count);
> - if (rc < 0)
> - goto out_err;
> -
> - count += burstcnt;
> -
> - if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> >timeout_c,
> - >int_queue, false) < 0) {
> - rc = -ETIME;
> - goto out_err;
> - }
> - status = tpm_tis_status(chip);
> - if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
> - rc = -EIO;
> - goto out_err;
> - }
> + /*
> +  * Get the initial burstcount to ensure TPM is ready to
> +  * accept data.
> +  */
> + burstcnt = get_burstcount(chip);
> + if (burstcnt < 0) {
> + dev_err(>dev, "Unable to read burstcount\n");
> + rc = burstcnt;
> + goto out_err;
>   }
> 
> + rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> + len - 1, buf);
> + if (rc < 0)
> + goto out_err;
> +
>   /* write last byte */
> - rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> buf[count]);
> + rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[len-
> 1]);
>   if (rc < 0)
>   goto out_err;
> 
> --
> 2.13.3

This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to send 
large amounts of data, e.g. with TPM2_Hash, and subsequent tests seem to take 
an unusual amount of time. More analysis probably has to wait until November, 
since I am going to be in Prague next week.

Alexander


RE: Fixing CVE-2017-15361

2017-10-26 Thread Alexander.Steffen
> On Thu, Oct 26, 2017 at 12:26:10AM +0200, Peter Huewe wrote:
> >
> >
> > Am 25. Oktober 2017 20:53:49 MESZ schrieb Jarkko Sakkinen
> :
> > >On Wed, Oct 25, 2017 at 07:17:17AM -0700, Matthew Garrett wrote:
> > >> On Wed, Oct 25, 2017 at 6:44 AM, Jarkko Sakkinen
> > >>  wrote:
> > >> > I'm implementing a fix for CVE-2017-15361 that simply blacklists
> > >> > vulnerable FW versions. I think this is the only responsible action
> > >from
> > >> > my side that I can do.
> > >>
> > >> I'm not sure this is ideal - do Infineon have any Linux tooling for
> > >> performing firmware updates, and if so will that continue working if
> > >> the device is blacklisted? It's also a poor user experience to have
> > >> systems using TPM-backed disk encryption keys suddenly rendered
> > >> unbootable, and making it as easy as possible for people to do an
> > >> upgrade and then re-seal secrets with new keys feels like the correct
> > >> approach.
> > >
> > >I talked today with Alexander Steffen in the KS unconference and we
> > >concluded that this would be a terrible idea.
> > >
> > >Alexander stated the following things about FW updates (Alexander,
> > >please correct me if I state something incorrectly or if you have
> > >something to add):
> > >
> > >* FW update can be constructed either in a way that the keys in the
> > >  NVRAM are not cleared or in a way that they are cleared.
> > >* FW update cannot be directly applied to the TPM but must come as
> > >  part of the firmware update from the vendor.
> > >
> > >I proposed the following as an alternative:
> > >
> > >* Print a message to the klog (which log level would be appropriate?).
> > Info?
> > Maybe warn, definitely not err
> 
> Since the driver does not fail usually warn would make sense but since
> here even allowing to continue to use such TPM is questionable I would
> use error here.
> 
> People anyway ignore klog too easily so using warn would be a mistake in
> my opinion. It's like saying that nothing serious is happening here,
> move along.
> 
> Do you think so?
> 
> > >* Possibly sleep for few seconds. Is this a good idea?
> > Helps how?
> 
> Obviously to get it noticed that the system integrity is broken.
> 
> > >While writing this email yet another alternative popped into my mind:
> > >what if we allow only in-kernel use but disallow the use of /dev/tpm0?
> > >You could still use trusted keys.
> > >
> > No, same terrible idea since you block the upgrade path.
> > Upgrade tools work from userspace via the kernel driver.
> > So /dev/tpm0 is necessary.
> 
> Right! How stupid of me (my previous response to Jerry) :-) Of course you
> can have special commands and talk to the TPM to do the upgrade even if
> it is part of the platform and not connected to a standard bus.
> 
> I got understanding in the yesterdays unconfernce discussion that it
> should be part of the firmware upgrade.

Yes, but it really depends on the way the vendor chooses to do the upgrade. 
UEFI Capsules would be one standard way that does not involve the Linux driver. 
But maybe you are on some embedded ARM platform without UEFI, then you can also 
run the upgrade through /dev/tpm0, so that you do not need to invent another 
way to talk to the TPM.

This second option does have the drawback of the Linux driver not being aware 
of the upgrade happening. It does not know that while the TPM is in the upgrade 
mode no other commands can be executed. Neither does it know that after the 
upgrade the system needs to be rebooted before the TPM can be used again (so 
that for example the PCRs have the correct values again). I want to look into 
those issues in the future.

> How do you do the upgrade through /dev/tpm0?
> 
> /Jarkko

Alexander


RE: Fixing CVE-2017-15361

2017-10-26 Thread Alexander.Steffen
> On Wed, Oct 25, 2017 at 07:17:17AM -0700, Matthew Garrett wrote:
> > On Wed, Oct 25, 2017 at 6:44 AM, Jarkko Sakkinen
> >  wrote:
> > > I'm implementing a fix for CVE-2017-15361 that simply blacklists
> > > vulnerable FW versions. I think this is the only responsible action from
> > > my side that I can do.
> >
> > I'm not sure this is ideal - do Infineon have any Linux tooling for
> > performing firmware updates, and if so will that continue working if
> > the device is blacklisted? It's also a poor user experience to have
> > systems using TPM-backed disk encryption keys suddenly rendered
> > unbootable, and making it as easy as possible for people to do an
> > upgrade and then re-seal secrets with new keys feels like the correct
> > approach.
> 
> I talked today with Alexander Steffen in the KS unconference and we
> concluded that this would be a terrible idea.

Right. Thinking more about this issue, I'd say the ideal way to handle it would 
be in the applications using the TPM. Not all functionalities of the TPM are 
affected, so only the applications can know whether their use cases are still 
safe and it are the applications that need to migrate to a safe solution should 
this be necessary. (Of course, this puts the burden on each individual 
application instead of having one central place to decide what is safe and what 
isn't, but I do not see any way around that.)

As far as I know, the kernel itself is not using any of the affected 
functionalities, so there is no need for an immediate mitigation within the 
kernel. But I'd like to hear about how similar issues were handled in the past. 
I can think of multiple severe security issues, for example in BIOS 
implementations, but I cannot recall ever hearing about the kernel refusing to 
boot on such machines or even do so much as print a warning about that 
vulnerability.

> Alexander stated the following things about FW updates (Alexander,
> please correct me if I state something incorrectly or if you have
> something to add):
> 
> * FW update can be constructed either in a way that the keys in the
>   NVRAM are not cleared or in a way that they are cleared.

Correct. But as far as I know, the updates that were already published for this 
issue do not delete any of the keys. And I do not think that this would be a 
good idea. After all, the applications still might need access to their key to 
decrypt their data and reencrypt it with a safe key after applying the update.

> * FW update cannot be directly applied to the TPM but must come as
>   part of the firmware update from the vendor.

Yes, starting the upgrade process is guarded by platformAuth/platformPolicy (in 
the case of TPM2), so the platform vendor needs to be involved. And you want 
them to be involved, since they need to make sure that their system still works 
with the updated TPM. I'm not sure whether platform vendors do that for TPMs, 
but for wireless cards whitelisting in the BIOS is common, and you do not want 
your machine refusing to boot just because the BIOS does not recognize your 
TPM's firmware version anymore (as a simple example).

> I proposed the following as an alternative:
> 
> * Print a message to the klog (which log level would be appropriate?).
> * Possibly sleep for few seconds. Is this a good idea?

I'd be okay with that, but ideally we'd have some kind of agreement/history of 
how to handle similar security issues in hardware components in general. 
Implementing a special case just for this TPM vulnerability does not seem like 
the right thing to do, especially when the kernel itself is not affected (and 
thus the whole machine might not be affected for the way that it is used). We 
do not want to confuse users or make them expect similar warnings in the 
future, when we might have no intention of providing them.

> While writing this email yet another alternative popped into my mind:
> what if we allow only in-kernel use but disallow the use of /dev/tpm0?
> You could still use trusted keys.
> 
> Here are all the ideas that I have and I am open for better
> alternatives.
> 
> /Jarkko

Alexander


RE: Fixing CVE-2017-15361

2017-10-26 Thread Alexander.Steffen
> > On Wed, Oct 25, 2017 at 07:17:17AM -0700, Matthew Garrett wrote:
> > > On Wed, Oct 25, 2017 at 6:44 AM, Jarkko Sakkinen
> > >  wrote:
> > > > I'm implementing a fix for CVE-2017-15361 that simply blacklists
> > > > vulnerable FW versions. I think this is the only responsible action from
> > > > my side that I can do.
> > >
> > > I'm not sure this is ideal - do Infineon have any Linux tooling for
> > > performing firmware updates, and if so will that continue working if
> > > the device is blacklisted? It's also a poor user experience to have
> > > systems using TPM-backed disk encryption keys suddenly rendered
> > > unbootable, and making it as easy as possible for people to do an
> > > upgrade and then re-seal secrets with new keys feels like the correct
> > > approach.
> >
> > I talked today with Alexander Steffen in the KS unconference and we
> > concluded that this would be a terrible idea.
> 
> Right. Thinking more about this issue, I'd say the ideal way to handle it 
> would
> be in the applications using the TPM. Not all functionalities of the TPM are
> affected, so only the applications can know whether their use cases are still
> safe and it are the applications that need to migrate to a safe solution 
> should
> this be necessary. (Of course, this puts the burden on each individual
> application instead of having one central place to decide what is safe and
> what isn't, but I do not see any way around that.)
> 
> As far as I know, the kernel itself is not using any of the affected
> functionalities, so there is no need for an immediate mitigation within the
> kernel.

This does not seem to be entirely true. The code in security/keys/trusted.c 
might be affected under some circumstances, but I am not familiar enough with 
that code to say for sure. As far as I can tell, it does not create any RSA 
keys directly, but it might reference them as parent keys, so that vulnerable 
keys are used to protect the secrets. 

If that is the case, can we detect that issue and migrate the secrets to a safe 
configuration? Do we (as the kernel) want to do that? How much do we want/have 
to involve the user in the process?

> But I'd like to hear about how similar issues were handled in the past.
> I can think of multiple severe security issues, for example in BIOS
> implementations, but I cannot recall ever hearing about the kernel refusing
> to boot on such machines or even do so much as print a warning about that
> vulnerability.
> 
> > Alexander stated the following things about FW updates (Alexander,
> > please correct me if I state something incorrectly or if you have
> > something to add):
> >
> > * FW update can be constructed either in a way that the keys in the
> >   NVRAM are not cleared or in a way that they are cleared.
> 
> Correct. But as far as I know, the updates that were already published for 
> this
> issue do not delete any of the keys. And I do not think that this would be a
> good idea. After all, the applications still might need access to their key to
> decrypt their data and reencrypt it with a safe key after applying the update.
> 
> > * FW update cannot be directly applied to the TPM but must come as
> >   part of the firmware update from the vendor.
> 
> Yes, starting the upgrade process is guarded by platformAuth/platformPolicy
> (in the case of TPM2), so the platform vendor needs to be involved. And you
> want them to be involved, since they need to make sure that their system
> still works with the updated TPM. I'm not sure whether platform vendors do
> that for TPMs, but for wireless cards whitelisting in the BIOS is common, and
> you do not want your machine refusing to boot just because the BIOS does
> not recognize your TPM's firmware version anymore (as a simple example).
> 
> > I proposed the following as an alternative:
> >
> > * Print a message to the klog (which log level would be appropriate?).
> > * Possibly sleep for few seconds. Is this a good idea?
> 
> I'd be okay with that, but ideally we'd have some kind of agreement/history
> of how to handle similar security issues in hardware components in general.
> Implementing a special case just for this TPM vulnerability does not seem like
> the right thing to do, especially when the kernel itself is not affected (and
> thus the whole machine might not be affected for the way that it is used).
> We do not want to confuse users or make them expect similar warnings in
> the future, when we might have no intention of providing them.
> 
> > While writing this email yet another alternative popped into my mind:
> > what if we allow only in-kernel use but disallow the use of /dev/tpm0?
> > You could still use trusted keys.
> >
> > Here are all the ideas that I have and I am open for better
> > alternatives.
> >
> > /Jarkko
> 
> Alexander


RE: TPM driver breaks S3 suspend

2017-12-22 Thread Alexander.Steffen
> On Thu, Dec 21, 2017 at 6:19 PM,   wrote:
> >> Hi,
> >> We have a desktop which has S3 suspend (to RAM) problem due to
> >> error messages as follows.
> >> [  198.908282] tpm tpm0: Error (38) sending savestate before suspend
> >> [  198.908289] __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x160
> returns
> >> 38
> >> [  198.908293] dpm_run_callback(): pnp_bus_suspend+0x0/0x20 returns
> 38
> >> [  198.908298] PM: Device 00:0b failed to suspend: error 38
> >>
> >> However, the first suspend after boot is working although it still
> >> shows an interesting message during resume.
> >> [  155.789945] tpm tpm0: A TPM error (38) occurred continue selftest
> >>
> >> The error code 38 in definition is TPM_ERR_INVALID_POSTINIT. I
> >> found some explanations which said this error code means that this
> >> command was received in the wrong sequence relative to a TPM_Startup
> >> command. Don't really know what happens here and how should I deal
> >> with this? Any suggestions? Please let me know what else information
> >> should I provide. Thanks
> >>
> >> Chris
> >
> > Just from looking at the code, this seems to be an issue in tpm_tis_resume.
> >
> > When the device is not a TPM 2.0, it tries to execute the selftests, but
> ignores the results. In your case the selftests fail during resume, but since
> the error is ignored, the TPM device is still present (though non-functional)
> and so breaks the subsequent suspend.
> >
> > In addition, from the error code we can tell that it is not actually a 
> > selftest
> failure. INVALID_POSTINIT for a command other than TPM_Startup means
> that no TPM_Startup has been executed for that power cycle yet, so the
> TPM has to reject all other commands. Usually, the platform sends the
> TPM_Startup command, but not in your case apparently.
> >
> > The correct solution should be something like tpm2_auto_startup (execute
> selftests, if they fail because of the missing startup command, execute that
> and retry the selftests). Interestingly, tpm1_auto_startup (same purpose as
> tpm2_auto_startup, but for TPM 1.2 instead) does not use the same
> sequence, the startup-retry part is missing. Is there any reason this is done
> differently for TPM 1.2? Otherwise I'd propose to make tpm1_auto_startup
> follow the same sequence as tpm2_auto_startup and then call both from
> tpm_tis_resume, similar to what tpm_chip_register does.
> >
> > Alexander
> 
> You mean do tpm1(or 2)_auto_startup when I fail selftest with error
> code 38? Then it should retry until the TPM state back to correct
> state?

Yes, but that will only help once we've taught tpm1_auto_startup to handle 
error code 38 similar to tpm2_auto_startup.

But you can try whether that approach solves your problem without rebuilding 
the kernel, by sending the TPM_Startup command from user space:

python3 -c 'f=open("/dev/tpm0", "r+b", buffering=0); 
f.write(b"\x00\xc1\x00\x00\x00\x0c\x00\x00\x00\x99\x00\x01"
); print(f.readall())'

Could you try the following sequence?
1. Boot your system.
2. Suspend and resume your system. 
3. Send TPM_Startup manually.
4. Go to step 2.

If my theory is correct, the TPM should no longer fail during suspend, though 
you'll still get the same error message when resuming.

Alexander


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-22 Thread Alexander.Steffen
> Hi Paul,
> 
> On Mon, 2017-12-11 at 13:54 +0100, Paul Menzel wrote:
> > Dear Jason,
> >
> >
> > On 12/08/17 17:18, Jason Gunthorpe wrote:
> > > On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > >
> > >> I have no access to the system right now, but want to point out, that the
> > >> log was created by `journactl -k`, so I do not know if that messes with
> the
> > >> time stamps. I checked the output of `dmesg` but didn’t see the TPM
> error
> > >> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id
> 0xFE,
> > >> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > >
> > > It is a good question, I don't know.. If your kernel isn't setup to
> > > timestamp messages then the journalstamp will certainly be garbage.
> > >
> > > No idea why you wouldn't see the messages in dmesg, if they are not in
> > > dmesg they couldn't get into the journal
> >
> > It looks like I was running an older Linux kernel version, when running
> > `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > kernel time stamps, showing that the delays work correctly.
> >
> > ```
> > $ uname -a
> > Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > $ sudo dmesg | grep TPM
> > [0.00] ACPI: TPM2 0x6F332168 34 (v03Tpm2Tabl
> > 0001 AMI  )
> > [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [3.734808] tpm tpm0: TPM self test failed
> > [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > ```
> 
> I've sort of been following this thread, but just want to make sure
> that once the self test is/was fixed, that you aren't seeing the IMA
> message.
> 
> Assuming this is fixed, could someone provide the commit that fixes
> it?

I don't think we've found a solution yet. There might be a firmware upgrade 
that changes that TPM's behavior. Or maybe my latest patch helps? 
https://patchwork.kernel.org/patch/10130535/

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-17 Thread Alexander.Steffen
> > Replace the specification of data structures by pointer dereferences
> > as the parameter for the operator "sizeof" to make the corresponding
> > size
> > determination a bit safer according to the Linux coding style
> > convention.
> 
> 
> This patch does one style in favor of the other.

I actually prefer that style, so I'd welcome this change :)

> At the end it's Jarkko's call, though I would NAK this as I think some
> one already told this to you for some other similar patch(es).
> 
> 
> I even would suggest to stop doing this noisy stuff, which keeps people
> busy for nothing.

Cleaning up old code is also worth something, even if does not change one bit 
in the assembly output in the end...

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-17 Thread Alexander.Steffen
> On Tue, 2017-10-17 at 11:50 +, alexander.stef...@infineon.com
> wrote:
> > > > Replace the specification of data structures by pointer dereferences
> > > > as the parameter for the operator "sizeof" to make the corresponding
> > > > size
> > > > determination a bit safer according to the Linux coding style
> > > > convention.
> > >
> > >
> > > This patch does one style in favor of the other.
> >
> > I actually prefer that style, so I'd welcome this change :)
> 
> Style changes should be reviewed and documented, like any other code
> change, and added to Documentation/process/coding-style.rst or an
> equivalent file.
> 
> > > At the end it's Jarkko's call, though I would NAK this as I think some
> > > one already told this to you for some other similar patch(es).
> > >
> > >
> > > I even would suggest to stop doing this noisy stuff, which keeps people
> > > busy for nothing.
> >
> > Cleaning up old code is also worth something, even if does not
> > change one bit in the assembly output in the end...
> 
> Wow, you're opening the door really wide for all sorts of trivial
> changes!  Hope you have the time and inclination to review and comment
> on all of them.  I certainly don't.

Well, isn't the point of trivial changes that they are trivial to review? :) 
For things like that there is probably not even a need to run a test, though 
with sufficient automation that should not be a problem either.

> There is a major difference between adding these sorts of checks to
> the tools in the scripts directory or even to the zero day bots that
> catch different sorts of errors, BEFORE code is upstreamed, and
> patches like these, after the fact.

Catching those things early in the process is certainly preferable. But at some 
point you need to fix the existing code, or you'll end up with a mashup of 
different styles, just because you did not want to touch old code.

> After the code has been upstreamed, it is a lot more difficult to
> justify changes like this.  It impacts both code that is being
> developed AND backporting bug fixes.

Backporting could be an argument, but even that should not be allowed to block 
improvements indefinitely. I'd prefer a world in which the current code is nice 
and clean and easy to maintain, to a world where we never touch old code unless 
it is proven to be wrong.

But looking at the code in question, I cannot see how this should ever be a 
serious problem. Even when backporting a change takes now ten minutes instead 
of five, which means it is twice as hard, it is still not difficult.

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-18 Thread Alexander.Steffen
> On Tue, 17 Oct 2017, Mimi Zohar wrote:
> 
> > On Tue, 2017-10-17 at 14:58 +0200, Julia Lawall wrote:
> > >
> > > On Tue, 17 Oct 2017, Mimi Zohar wrote:
> > >
> > > > On Tue, 2017-10-17 at 11:50 +, alexander.stef...@infineon.com
> > > > wrote:
> > > > > > > Replace the specification of data structures by pointer
> dereferences
> > > > > > > as the parameter for the operator "sizeof" to make the
> corresponding
> > > > > > > size
> > > > > > > determination a bit safer according to the Linux coding style
> > > > > > > convention.
> > > > > >
> > > > > >
> > > > > > This patch does one style in favor of the other.
> > > > >
> > > > > I actually prefer that style, so I'd welcome this change :)
> > > >
> > > > Style changes should be reviewed and documented, like any other
> code
> > > > change, and added to Documentation/process/coding-style.rst or an
> > > > equivalent file.
> > >
> > > Actually, it has been there for many years:
> > >
> > > 14) Allocating memory
> > > -
> > > ...
> > > The preferred form for passing a size of a struct is the following:
> > >
> > > .. code-block:: c
> > >
> > >   p = kmalloc(sizeof(*p), ...);
> > >
> > > The alternative form where struct name is spelled out hurts readability
> and
> > > introduces an opportunity for a bug when the pointer variable type is
> changed
> > > but the corresponding sizeof that is passed to a memory allocator is not.
> >
> > True, thanks for the reminder.  Is this common in new code?  Is there
> > a script/ or some other automated way of catching this usage before
> > patches are upstreamed?
> >
> > Just as you're doing here, the patch description should reference this
> > in the patch description.
> 
> The comment in the documentation seems have been there since Linux
> 2.6.14,
> ie 2005.  The fact that a lot of code still doesn't use that style, 12
> years later, suggests that actually it is not preferred, or not preferred
> by everyone.  Perhaps the paragraph in coding style should just be
> dropped.

Or maybe everyone just copied existing code, which did not follow that pattern, 
because nobody bothered to fix old code ;-)

(This is true at least for tpm_tis_spi, where I was involved in its creation.)

Alexander


RE: char-TPM: Adjustments for ten function implementations

2017-10-18 Thread Alexander.Steffen
> On Wed, 2017-10-18 at 11:00 +0200, SF Markus Elfring wrote:
> > > The printk removals do change the objects.
> > >
> > > The value of that type of change is only for resource limited systems.
> >
> > I imagine that such small code adjustments are also useful for other
> systems.
> 
> Your imagination and mine differ.
> Where do you _think_ it matters?
> 
> For instance, nothing about
> 
>   sizeof(type)
> vs
>   sizeof(*ptr)
> 
> makes it easier for a human to read the code.

If it does not make it easier to read the code for you, then maybe you should 
consider that this might not be true for all humans. For me, it makes it much 
easier to see at a glance, that code like ptr=malloc(sizeof(*ptr)) is correct.

Alexander


RE: char-TPM: Adjustments for ten function implementations

2017-10-18 Thread Alexander.Steffen
> On Wed, 18 Oct 2017, alexander.stef...@infineon.com wrote:
> 
> > > On Wed, 2017-10-18 at 11:00 +0200, SF Markus Elfring wrote:
> > > > > The printk removals do change the objects.
> > > > >
> > > > > The value of that type of change is only for resource limited systems.
> > > >
> > > > I imagine that such small code adjustments are also useful for other
> > > systems.
> > >
> > > Your imagination and mine differ.
> > > Where do you _think_ it matters?
> > >
> > > For instance, nothing about
> > >
> > >   sizeof(type)
> > > vs
> > >   sizeof(*ptr)
> > >
> > > makes it easier for a human to read the code.
> >
> > If it does not make it easier to read the code for you, then maybe you
> > should consider that this might not be true for all humans. For me, it
> > makes it much easier to see at a glance, that code like
> > ptr=malloc(sizeof(*ptr)) is correct.
> 
> I don't think there is a perfect solution.

Maybe. But for the second variant the correctness is easier to check, both 
mentally and programmatically, because there is no need for any context (the 
type of ptr does not matter).

> The type argument to sizeof
> could have the wrong type.  The expression argument to sizeof could be
> missing the *.  Unpleasant consequences are possible in both cases.
> Probably each maintainer has a style they prefer.  Perhaps it could be
> useful to adjust the code to follow the dominant strategy, in cases where
> there are a inconsistencies.

Certainly. At least within a file, there should be only one style.

> For example
> 
> if (...)
>   x = foo1(sizeof(struct xtype));
> else
>   x = foo2(sizeof(*x));
> 
> might at least cause some unnecessary mental effort to process.
> 
> julia

Alexander


RE: char-TPM: Adjustments for ten function implementations

2017-10-18 Thread Alexander.Steffen
> On Wed, 2017-10-18 at 10:44 +, alexander.stef...@infineon.com wrote:
> > > For instance, nothing about
> > > > >   sizeof(type)
> > > > > vs
> > > > >   sizeof(*ptr)
> > > > > makes it easier for a human to read the code.
> > > >
> > > > If it does not make it easier to read the code for you, then maybe you
> > > > should consider that this might not be true for all humans. For me, it
> > > > makes it much easier to see at a glance, that code like
> > > > ptr=malloc(sizeof(*ptr)) is correct.
> > >
> > > I don't think there is a perfect solution.
> >
> > Maybe. But for the second variant the correctness is easier to check,
> 
> How often should
>   ptr = alloc(sizeof(*ptr))
> be
>   ptr = alloc(sizeof(**ptr))

Never? Because in that case it probably should be *ptr=alloc(sizeof(**ptr)), 
unless you are doing something horrible ;-)

Alexander


RE: [PATCH 0/4] tpm: fix PS/2 devices not working on Braswell systems due CLKRUN enabled

2017-12-20 Thread Alexander.Steffen
> Hi Hans,
> 
> Thanks a lot for your feedback.
> 
> On 12/20/2017 12:43 PM, Hans de Goede wrote:
> > Hi,
> >
> > On 20-12-17 12:35, Javier Martinez Canillas wrote:
> >> Hello,
> >>
> >> Commit 5e572cab92f0 ("tpm: Enable CLKRUN protocol for Braswell
> systems")
> >> added logic in the TPM TIS driver to disable the Low Pin Count CLKRUN
> >> signal during TPM transactions.
> >>
> >> Unfortunately this breaks other devices that are attached to the LPC bus
> >> like for example PS/2 mouse and keyboards.
> >>
> >> The bug was reported to the Fedora kernel [0] and the kernel bugzilla [1].
> >> This issue and the propossed solution were discussed in this [2] thread,
> >> and the reporter (Jame Ettle) confirmed that his system works again after
> >> the fix in this series.
> >>
> >> The patches are based on top or Jarkko Sakkinen's linux-tpmdd [3] tree.
> >>
> >> James,
> >>
> >> Even when there shouldn't be any functional changes, I included some
> other
> >> fixes / cleanups in this series so it would be great if you can test them
> >> again. I can't because I don't have access to a machine affected by this.
> >>
> >> [0]: https://bugzilla.redhat.com/show_bug.cgi?id=1498987
> >> [1]: https://bugzilla.kernel.org/show_bug.cgi?id=197287
> >> [2]: https://patchwork.kernel.org/patch/10119417/
> >> [3]: git.infradead.org/users/jjs/linux-tpmdd.git
> >>
> >> Best regards,
> >> Javier
> >>
> >>
> >> Javier Martinez Canillas (4):
> >>tpm: fix access attempt to an already unmapped I/O memory region
> >>tpm: delete the TPM_TIS_CLK_ENABLE flag
> >>tpm: follow coding style for variable declaration in
> >>  tpm_tis_core_init()
> >>tpm: only attempt to disable the LPC CLKRUN if is already enabled
> >>
> >>   drivers/char/tpm/tpm_tis.c  | 17 +
> >>   drivers/char/tpm/tpm_tis_core.c | 24 +---
> >>   drivers/char/tpm/tpm_tis_core.h |  1 -
> >>   3 files changed, 18 insertions(+), 24 deletions(-)
> >
> > Note I'm just reading along here, but I'm wondering if both the TPM
> > and now also some PS/2 controllers need CLK_RUN to be disabled,
> > why don't we just disable it once permanently and be done with it?
> >
> 
> That's the same question I asked to Azhar who authored the patch that
> added the CLKRUN toggle logic to the driver, but he didn't answer yet:
> 
> https://www.spinics.net/lists/linux-integrity/msg01107.html
> 
> My understanding is that by permanently disabling it, then other devices
> that do support the CLKRUN protocol will continue to work correctly since
> the CLKRUN# signal is optional and only used if the host LCLK is stopped.
> 
> The disadvantage though will be that the power management feature of
> stopping
> the LPC host LCLK clock when entering into low-power states will not be
> used.
> 
> > It seems that on machines with a PS/2 controller connected to
> > the LPC bus the BIOS is already doing this, so I've a feeling that
> > it not being done on devices with a TPM is a bug in the firmware
> 
> Absolutely agree, system integratos should make sure that all the devices
> connected to the LPC either have CLKRUN protocol support and is enabled
> or disable the CLKRUN protocol permanently.

As far as I understand it, this is exactly the issue here: They know that there 
are devices that do not support the CLKRUN protocol (the TPM in this case), but 
they still need to enable it to prevent other issues. So for the TPM to 
continue to work, CLKRUN needs to be disabled temporarily while the TPM is 
active.

> > there and we should just disable it everywhere (and probably
> > find a better place then the TPM driver to do the disabling).
> >
> 
> Indeed. Touching a global bus configuration in a driver for a single device
> isn't safe to say the least. One problem is that we don't have a LPC bus
> subsystem so that's why these sort of quirks are done at the driver level.
> 
> > Note this is just an observation, I could be completely wrong here,
> > but I've a feeling that just disabling CLKRUN all together is the
> > right thing to do and that seems like an easier fix to me.
> >
> 
> I think your observation is correct. The only problem is the power
> management
> feature being disabled as mentioned. Although as you said it seems that
> most
> BIOS do that (as shown by the patch I posted that just avoids toggling the
> CLKRUN state if it's already disabled).
> 
> > Specifically what worries me is: what if another driver also takes
> > the temporarily disable CLK_RUN approach because of similar issues?
> > Now we've 2 drivers playing with the CLKRUN state and racing with
> > each others.
> >
> 
> Agreed. You don't even need another driver, if a Braswell system has 2 TPMs
> then you have a race condition and one driver could enable the CLKRUN
> state
> while the other driver thinks that's already disabled, and TPM transactions
> won't work in that case.

Even though it is an unusual configuration to have multiple TPMs in a system, 
this is something that we 

RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-14 Thread Alexander.Steffen
> [Mario from Dell added to CC list.]
> 
> Dear Alexander,
> 
> 
> On 12/11/17 17:08, alexander.stef...@infineon.com wrote:
> 
> >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> >>>
>  I have no access to the system right now, but want to point out, that
> the
>  log was created by `journactl -k`, so I do not know if that messes with
> the
>  time stamps. I checked the output of `dmesg` but didn’t see the TPM
> error
>  messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-
> id 0xFE,
>  rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> >>>
> >>> It is a good question, I don't know.. If your kernel isn't setup to
> >>> timestamp messages then the journalstamp will certainly be garbage.
> >>>
> >>> No idea why you wouldn't see the messages in dmesg, if they are not in
> >>> dmesg they couldn't get into the journal
> >>
> >> It looks like I was running an older Linux kernel version, when running
> >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> >> kernel time stamps, showing that the delays work correctly.
> >>
> >> ```
> >> $ uname -a
> >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> >> $ sudo dmesg | grep TPM
> >> [0.00] ACPI: TPM2 0x6F332168 34 (v03Tpm2Tabl
> >> 0001 AMI  )
> >> [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> >> [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [3.734808] tpm tpm0: TPM self test failed
> >> [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> >> ```
> >
> > Thanks for the fixed log. So your TPM seems to be rather slow with
> executing the selftests. Could try to apply the patch that I've just sent 
> you? It
> ensures that your TPM gets more time to execute all the tests, up to the limit
> set in the PTP.
> 
> Thank you for your patch. Judging from the time stamps, it seems it
> works, but the TPM still fails.
> 
> ```
> $ dmesg | grep tpm
> [1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> [1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> [2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> [3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> [6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> [6.303242] tpm tpm0: TPM self test failed
> ```
> 
> To be clear, this issue is not reproducible during every start. (But
> that was the same before.)

Thanks for testing. Now you are in the unlucky situation that your TPM was 
probably always broken, but old kernels did not detect that and used it anyway.

To add some more details to what the problem is: The PTP limits the maximum 
runtime of the TPM2_SelfTest command that we try to execute here to 2000ms (see 
https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision_1.3v22.pdf
 table 15, page 65 in the PDF, page 57 according to the printed page numbers). 
Technically, we have no evidence that your TPM is in violation of that 
specification, because it does reply to the command within 2000ms, it just has 
not completed the selftests within that timeframe. But clearly the intention of 
the specification authors was to have the selftests completed within that 
limit, there is no sense in allowing 2s just for the TPM to generate an answer 
without actually making any progress.

The TPM2_SelfTest command is special in that it is allowed to either execute 
all selftests and then return TPM_RC_SUCCESS or just schedule the selftest 
execution in the background and return TPM_RC_TESTING immediately (see 
https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-3-Commands-01.38.pdf
 chapter 10.2.1, page 43/29). Your TPM apparently chooses the second option, 
but (sometimes?) fails to complete the selftests within the limit that we set 
(which is far longer than the 2s from the PTP).

I'm not sure what to do about 

RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-14 Thread Alexander.Steffen
> > -Original Message-
> > From: alexander.stef...@infineon.com
> [mailto:alexander.stef...@infineon.com]
> > Sent: Thursday, December 14, 2017 6:21 AM
> > To: pmen...@molgen.mpg.de; j...@ziepe.ca
> > Cc: linux-integr...@vger.kernel.org; linux-kernel@vger.kernel.org;
> Limonciello,
> > Mario 
> > Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error
> (2314)
> > occurred continue selftest`
> >
> > > [Mario from Dell added to CC list.]
> > >
> > > Dear Alexander,
> > >
> > >
> > > On 12/11/17 17:08, alexander.stef...@infineon.com wrote:
> > >
> > > >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > > >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > > >>>
> > >  I have no access to the system right now, but want to point out,
> that
> > > the
> > >  log was created by `journactl -k`, so I do not know if that messes
> with
> > > the
> > >  time stamps. I checked the output of `dmesg` but didn’t see the
> TPM
> > > error
> > >  messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM
> (device-
> > > id 0xFE,
> > >  rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > > >>>
> > > >>> It is a good question, I don't know.. If your kernel isn't setup to
> > > >>> timestamp messages then the journalstamp will certainly be
> garbage.
> > > >>>
> > > >>> No idea why you wouldn't see the messages in dmesg, if they are
> not in
> > > >>> dmesg they couldn't get into the journal
> > > >>
> > > >> It looks like I was running an older Linux kernel version, when running
> > > >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > > >> kernel time stamps, showing that the delays work correctly.
> > > >>
> > > >> ```
> > > >> $ uname -a
> > > >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > > >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > > >> $ sudo dmesg | grep TPM
> > > >> [0.00] ACPI: TPM2 0x6F332168 34 (v03
> Tpm2Tabl
> > > >> 0001 AMI  )
> > > >> [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > >> [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [3.734808] tpm tpm0: TPM self test failed
> > > >> [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > > >> ```
> > > >
> > > > Thanks for the fixed log. So your TPM seems to be rather slow with
> > > executing the selftests. Could try to apply the patch that I've just sent
> you? It
> > > ensures that your TPM gets more time to execute all the tests, up to the
> limit
> > > set in the PTP.
> > >
> > > Thank you for your patch. Judging from the time stamps, it seems it
> > > works, but the TPM still fails.
> > >
> > > ```
> > > $ dmesg | grep tpm
> > > [1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > [1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [6.303242] tpm tpm0: TPM self test failed
> > > ```
> > >
> > > To be clear, this issue is not reproducible during every start. (But
> > > that was the same before.)
> >
> > Thanks for testing. Now you are in the unlucky situation that your TPM was
> > probably always broken, but old kernels did not detect that and used it
> anyway.
> >
> Something that Paul can consider is to upgrade the TPM firmware if it's not
> already
> upgraded.  Since the launch of XPS 9360 there was at least one TPM firmware
> update
> issued.  It has been posted to LVFS and can be upgraded using
> fwupd/fwupdate.
> Note: If your TPM is currently owned you will need to go into BIOS setup to
> clear it
> first before upgrading.

I'm not familiar with the specific TPM in your model, but according to the log 
it is a TPM 2.0, which does not really carry over the owner concept of a TPM 
1.2. Is clearing it still necessary for an upgrade then?

> I don’t have any insight into 

RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-15 Thread Alexander.Steffen
> [Adding Rafael and Len as they, to my knowledge, also use or have a
> access to a Dell XPS 13 9360. With latest Linux master do you get TPM
> self-test errors, when cold starting the system without the power supply
> plugged in?]
> 
> Dear Mario, dear Alexander,
> 
> 
> the added line breaks to the quoted parts really mess up the citation.
> Can we please try to use MUAs avoiding that, or fixing that manually?

Sorry, I'm not sure whether my company has a way for me to avoid using Outlook 
;-) But if there are any configuration changes to make it behave better, I will 
gladly apply them. Do you know of any documentation on this? All I found so far 
either is already applied or was outdated.

I'll remove some of the less relevant quoted parts, so that this is less of an 
issue.

> > To be clear, this issue is not reproducible during every start. (But
> > that was the same before.)
> 
> I think I found out how to reproduce the issue. Cold start the system
> without the power supply connected.
> 
>  Thanks for testing. Now you are in the unlucky situation that your TPM
> was
>  probably always broken, but old kernels did not detect that and used it
> anyway.
> 
> Just to clarify, I do not know if the TPM could ever be used. I believe
> the module loaded but the user space tools (tpm2_version or so) always
> returned an error in my tests.

Interesting. So maybe it is not a bug in your TPM's firmware, but really a 
single defective TPM? Can you try to figure that out? That is, when using an 
older kernel in the cold start scenario, can you execute any useful commands on 
your TPM successfully?

> >>> Something that Paul can consider is to upgrade the TPM firmware if it's
> not
> >>> already
> >>> upgraded.  Since the launch of XPS 9360 there was at least one TPM
> firmware
> >>> update
> >>> issued.  It has been posted to LVFS and can be upgraded using
> >>> fwupd/fwupdate.
> >>> Note: If your TPM is currently owned you will need to go into BIOS setup
> to
> >>> clear it
> >>> first before upgrading.
> >>
> >> I'm not familiar with the specific TPM in your model, but according to the
> log it is a
> >> TPM 2.0, which does not really carry over the owner concept of a TPM 1.2.
> Is
> >> clearing it still necessary for an upgrade then?
> >
> > Yes it's required for the TPM model/vendor that is used in the XPS model
> that
> > Paul has.  If you try to run the upgrade without clearing it the firmware 
> > will
> > reject the upgrade.
> 
> Mario, thank you for your quick reaction.
> 
> […]
> 
> 1.  Can you reproduce this issue too?
> 2.  How do I find out, what TPM firmware version is installed?

If you get the driver loaded, you can ask the TPM (TPM2_GetCapability for 
TPM_PT_FIRMWARE_VERSION_1 and TPM_PT_FIRMWARE_VERSION_2):

python3 -c 'f=open("/dev/tpm0", "r+b", buffering=0); 
f.write(b"\x80\x01\x00\x00\x00\x16\x00\x00\x01z\x00\x00\x00\x06\x00\x00\x01\x0b\x00\x00\x00\x02");
 print(f.readall())'

> 3.  Updating to the firmware 2.4.2 from December 17th, 2017 didn’t fix
> the issue.

You've got a firmware from the future? ;-)

Alexander


RE: TPM driver breaks S3 suspend

2017-12-21 Thread Alexander.Steffen
> Hi,
> We have a desktop which has S3 suspend (to RAM) problem due to
> error messages as follows.
> [  198.908282] tpm tpm0: Error (38) sending savestate before suspend
> [  198.908289] __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x160 returns
> 38
> [  198.908293] dpm_run_callback(): pnp_bus_suspend+0x0/0x20 returns 38
> [  198.908298] PM: Device 00:0b failed to suspend: error 38
> 
> However, the first suspend after boot is working although it still
> shows an interesting message during resume.
> [  155.789945] tpm tpm0: A TPM error (38) occurred continue selftest
> 
> The error code 38 in definition is TPM_ERR_INVALID_POSTINIT. I
> found some explanations which said this error code means that this
> command was received in the wrong sequence relative to a TPM_Startup
> command. Don't really know what happens here and how should I deal
> with this? Any suggestions? Please let me know what else information
> should I provide. Thanks
> 
> Chris

Just from looking at the code, this seems to be an issue in tpm_tis_resume.

When the device is not a TPM 2.0, it tries to execute the selftests, but 
ignores the results. In your case the selftests fail during resume, but since 
the error is ignored, the TPM device is still present (though non-functional) 
and so breaks the subsequent suspend.

In addition, from the error code we can tell that it is not actually a selftest 
failure. INVALID_POSTINIT for a command other than TPM_Startup means that no 
TPM_Startup has been executed for that power cycle yet, so the TPM has to 
reject all other commands. Usually, the platform sends the TPM_Startup command, 
but not in your case apparently.

The correct solution should be something like tpm2_auto_startup (execute 
selftests, if they fail because of the missing startup command, execute that 
and retry the selftests). Interestingly, tpm1_auto_startup (same purpose as 
tpm2_auto_startup, but for TPM 1.2 instead) does not use the same sequence, the 
startup-retry part is missing. Is there any reason this is done differently for 
TPM 1.2? Otherwise I'd propose to make tpm1_auto_startup follow the same 
sequence as tpm2_auto_startup and then call both from tpm_tis_resume, similar 
to what tpm_chip_register does.

Alexander


RE: TPM driver breaks S3 suspend

2017-12-22 Thread Alexander.Steffen
> On Thu, Dec 21, 2017 at 6:19 PM,   wrote:
> >> Hi,
> >> We have a desktop which has S3 suspend (to RAM) problem due to
> >> error messages as follows.
> >> [  198.908282] tpm tpm0: Error (38) sending savestate before suspend
> >> [  198.908289] __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x160
> returns
> >> 38
> >> [  198.908293] dpm_run_callback(): pnp_bus_suspend+0x0/0x20 returns
> 38
> >> [  198.908298] PM: Device 00:0b failed to suspend: error 38
> >>
> >> However, the first suspend after boot is working although it still
> >> shows an interesting message during resume.
> >> [  155.789945] tpm tpm0: A TPM error (38) occurred continue selftest
> >>
> >> The error code 38 in definition is TPM_ERR_INVALID_POSTINIT. I
> >> found some explanations which said this error code means that this
> >> command was received in the wrong sequence relative to a TPM_Startup
> >> command. Don't really know what happens here and how should I deal
> >> with this? Any suggestions? Please let me know what else information
> >> should I provide. Thanks
> >>
> >> Chris
> >
> > Just from looking at the code, this seems to be an issue in tpm_tis_resume.
> >
> > When the device is not a TPM 2.0, it tries to execute the selftests, but
> ignores the results. In your case the selftests fail during resume, but since
> the error is ignored, the TPM device is still present (though non-functional)
> and so breaks the subsequent suspend.
> >
> > In addition, from the error code we can tell that it is not actually a 
> > selftest
> failure. INVALID_POSTINIT for a command other than TPM_Startup means
> that no TPM_Startup has been executed for that power cycle yet, so the
> TPM has to reject all other commands. Usually, the platform sends the
> TPM_Startup command, but not in your case apparently.
> >
> > The correct solution should be something like tpm2_auto_startup (execute
> selftests, if they fail because of the missing startup command, execute that
> and retry the selftests). Interestingly, tpm1_auto_startup (same purpose as
> tpm2_auto_startup, but for TPM 1.2 instead) does not use the same
> sequence, the startup-retry part is missing. Is there any reason this is done
> differently for TPM 1.2? Otherwise I'd propose to make tpm1_auto_startup
> follow the same sequence as tpm2_auto_startup and then call both from
> tpm_tis_resume, similar to what tpm_chip_register does.
> >
> > Alexander
> 
> You mean do tpm1(or 2)_auto_startup when I fail selftest with error
> code 38? Then it should retry until the TPM state back to correct
> state?

Yes, but that will only help once we've taught tpm1_auto_startup to handle 
error code 38 similar to tpm2_auto_startup.

But you can try whether that approach solves your problem without rebuilding 
the kernel, by sending the TPM_Startup command from user space:

python3 -c 'f=open("/dev/tpm0", "r+b", buffering=0); 
f.write(b"\x00\xc1\x00\x00\x00\x0c\x00\x00\x00\x99\x00\x01"
); print(f.readall())'

Could you try the following sequence?
1. Boot your system.
2. Suspend and resume your system. 
3. Send TPM_Startup manually.
4. Go to step 2.

If my theory is correct, the TPM should no longer fail during suspend, though 
you'll still get the same error message when resuming.

Alexander


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-22 Thread Alexander.Steffen
> Hi Paul,
> 
> On Mon, 2017-12-11 at 13:54 +0100, Paul Menzel wrote:
> > Dear Jason,
> >
> >
> > On 12/08/17 17:18, Jason Gunthorpe wrote:
> > > On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > >
> > >> I have no access to the system right now, but want to point out, that the
> > >> log was created by `journactl -k`, so I do not know if that messes with
> the
> > >> time stamps. I checked the output of `dmesg` but didn’t see the TPM
> error
> > >> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id
> 0xFE,
> > >> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > >
> > > It is a good question, I don't know.. If your kernel isn't setup to
> > > timestamp messages then the journalstamp will certainly be garbage.
> > >
> > > No idea why you wouldn't see the messages in dmesg, if they are not in
> > > dmesg they couldn't get into the journal
> >
> > It looks like I was running an older Linux kernel version, when running
> > `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > kernel time stamps, showing that the delays work correctly.
> >
> > ```
> > $ uname -a
> > Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > $ sudo dmesg | grep TPM
> > [0.00] ACPI: TPM2 0x6F332168 34 (v03Tpm2Tabl
> > 0001 AMI  )
> > [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [3.734808] tpm tpm0: TPM self test failed
> > [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > ```
> 
> I've sort of been following this thread, but just want to make sure
> that once the self test is/was fixed, that you aren't seeing the IMA
> message.
> 
> Assuming this is fixed, could someone provide the commit that fixes
> it?

I don't think we've found a solution yet. There might be a firmware upgrade 
that changes that TPM's behavior. Or maybe my latest patch helps? 
https://patchwork.kernel.org/patch/10130535/

Alexander


RE: Fixing CVE-2017-15361

2017-10-26 Thread Alexander.Steffen
> On Wed, Oct 25, 2017 at 07:17:17AM -0700, Matthew Garrett wrote:
> > On Wed, Oct 25, 2017 at 6:44 AM, Jarkko Sakkinen
> >  wrote:
> > > I'm implementing a fix for CVE-2017-15361 that simply blacklists
> > > vulnerable FW versions. I think this is the only responsible action from
> > > my side that I can do.
> >
> > I'm not sure this is ideal - do Infineon have any Linux tooling for
> > performing firmware updates, and if so will that continue working if
> > the device is blacklisted? It's also a poor user experience to have
> > systems using TPM-backed disk encryption keys suddenly rendered
> > unbootable, and making it as easy as possible for people to do an
> > upgrade and then re-seal secrets with new keys feels like the correct
> > approach.
> 
> I talked today with Alexander Steffen in the KS unconference and we
> concluded that this would be a terrible idea.

Right. Thinking more about this issue, I'd say the ideal way to handle it would 
be in the applications using the TPM. Not all functionalities of the TPM are 
affected, so only the applications can know whether their use cases are still 
safe and it are the applications that need to migrate to a safe solution should 
this be necessary. (Of course, this puts the burden on each individual 
application instead of having one central place to decide what is safe and what 
isn't, but I do not see any way around that.)

As far as I know, the kernel itself is not using any of the affected 
functionalities, so there is no need for an immediate mitigation within the 
kernel. But I'd like to hear about how similar issues were handled in the past. 
I can think of multiple severe security issues, for example in BIOS 
implementations, but I cannot recall ever hearing about the kernel refusing to 
boot on such machines or even do so much as print a warning about that 
vulnerability.

> Alexander stated the following things about FW updates (Alexander,
> please correct me if I state something incorrectly or if you have
> something to add):
> 
> * FW update can be constructed either in a way that the keys in the
>   NVRAM are not cleared or in a way that they are cleared.

Correct. But as far as I know, the updates that were already published for this 
issue do not delete any of the keys. And I do not think that this would be a 
good idea. After all, the applications still might need access to their key to 
decrypt their data and reencrypt it with a safe key after applying the update.

> * FW update cannot be directly applied to the TPM but must come as
>   part of the firmware update from the vendor.

Yes, starting the upgrade process is guarded by platformAuth/platformPolicy (in 
the case of TPM2), so the platform vendor needs to be involved. And you want 
them to be involved, since they need to make sure that their system still works 
with the updated TPM. I'm not sure whether platform vendors do that for TPMs, 
but for wireless cards whitelisting in the BIOS is common, and you do not want 
your machine refusing to boot just because the BIOS does not recognize your 
TPM's firmware version anymore (as a simple example).

> I proposed the following as an alternative:
> 
> * Print a message to the klog (which log level would be appropriate?).
> * Possibly sleep for few seconds. Is this a good idea?

I'd be okay with that, but ideally we'd have some kind of agreement/history of 
how to handle similar security issues in hardware components in general. 
Implementing a special case just for this TPM vulnerability does not seem like 
the right thing to do, especially when the kernel itself is not affected (and 
thus the whole machine might not be affected for the way that it is used). We 
do not want to confuse users or make them expect similar warnings in the 
future, when we might have no intention of providing them.

> While writing this email yet another alternative popped into my mind:
> what if we allow only in-kernel use but disallow the use of /dev/tpm0?
> You could still use trusted keys.
> 
> Here are all the ideas that I have and I am open for better
> alternatives.
> 
> /Jarkko

Alexander


RE: Fixing CVE-2017-15361

2017-10-26 Thread Alexander.Steffen
> On Thu, Oct 26, 2017 at 12:26:10AM +0200, Peter Huewe wrote:
> >
> >
> > Am 25. Oktober 2017 20:53:49 MESZ schrieb Jarkko Sakkinen
> :
> > >On Wed, Oct 25, 2017 at 07:17:17AM -0700, Matthew Garrett wrote:
> > >> On Wed, Oct 25, 2017 at 6:44 AM, Jarkko Sakkinen
> > >>  wrote:
> > >> > I'm implementing a fix for CVE-2017-15361 that simply blacklists
> > >> > vulnerable FW versions. I think this is the only responsible action
> > >from
> > >> > my side that I can do.
> > >>
> > >> I'm not sure this is ideal - do Infineon have any Linux tooling for
> > >> performing firmware updates, and if so will that continue working if
> > >> the device is blacklisted? It's also a poor user experience to have
> > >> systems using TPM-backed disk encryption keys suddenly rendered
> > >> unbootable, and making it as easy as possible for people to do an
> > >> upgrade and then re-seal secrets with new keys feels like the correct
> > >> approach.
> > >
> > >I talked today with Alexander Steffen in the KS unconference and we
> > >concluded that this would be a terrible idea.
> > >
> > >Alexander stated the following things about FW updates (Alexander,
> > >please correct me if I state something incorrectly or if you have
> > >something to add):
> > >
> > >* FW update can be constructed either in a way that the keys in the
> > >  NVRAM are not cleared or in a way that they are cleared.
> > >* FW update cannot be directly applied to the TPM but must come as
> > >  part of the firmware update from the vendor.
> > >
> > >I proposed the following as an alternative:
> > >
> > >* Print a message to the klog (which log level would be appropriate?).
> > Info?
> > Maybe warn, definitely not err
> 
> Since the driver does not fail usually warn would make sense but since
> here even allowing to continue to use such TPM is questionable I would
> use error here.
> 
> People anyway ignore klog too easily so using warn would be a mistake in
> my opinion. It's like saying that nothing serious is happening here,
> move along.
> 
> Do you think so?
> 
> > >* Possibly sleep for few seconds. Is this a good idea?
> > Helps how?
> 
> Obviously to get it noticed that the system integrity is broken.
> 
> > >While writing this email yet another alternative popped into my mind:
> > >what if we allow only in-kernel use but disallow the use of /dev/tpm0?
> > >You could still use trusted keys.
> > >
> > No, same terrible idea since you block the upgrade path.
> > Upgrade tools work from userspace via the kernel driver.
> > So /dev/tpm0 is necessary.
> 
> Right! How stupid of me (my previous response to Jerry) :-) Of course you
> can have special commands and talk to the TPM to do the upgrade even if
> it is part of the platform and not connected to a standard bus.
> 
> I got understanding in the yesterdays unconfernce discussion that it
> should be part of the firmware upgrade.

Yes, but it really depends on the way the vendor chooses to do the upgrade. 
UEFI Capsules would be one standard way that does not involve the Linux driver. 
But maybe you are on some embedded ARM platform without UEFI, then you can also 
run the upgrade through /dev/tpm0, so that you do not need to invent another 
way to talk to the TPM.

This second option does have the drawback of the Linux driver not being aware 
of the upgrade happening. It does not know that while the TPM is in the upgrade 
mode no other commands can be executed. Neither does it know that after the 
upgrade the system needs to be rebooted before the TPM can be used again (so 
that for example the PCRs have the correct values again). I want to look into 
those issues in the future.

> How do you do the upgrade through /dev/tpm0?
> 
> /Jarkko

Alexander


RE: Fixing CVE-2017-15361

2017-10-26 Thread Alexander.Steffen
> > On Wed, Oct 25, 2017 at 07:17:17AM -0700, Matthew Garrett wrote:
> > > On Wed, Oct 25, 2017 at 6:44 AM, Jarkko Sakkinen
> > >  wrote:
> > > > I'm implementing a fix for CVE-2017-15361 that simply blacklists
> > > > vulnerable FW versions. I think this is the only responsible action from
> > > > my side that I can do.
> > >
> > > I'm not sure this is ideal - do Infineon have any Linux tooling for
> > > performing firmware updates, and if so will that continue working if
> > > the device is blacklisted? It's also a poor user experience to have
> > > systems using TPM-backed disk encryption keys suddenly rendered
> > > unbootable, and making it as easy as possible for people to do an
> > > upgrade and then re-seal secrets with new keys feels like the correct
> > > approach.
> >
> > I talked today with Alexander Steffen in the KS unconference and we
> > concluded that this would be a terrible idea.
> 
> Right. Thinking more about this issue, I'd say the ideal way to handle it 
> would
> be in the applications using the TPM. Not all functionalities of the TPM are
> affected, so only the applications can know whether their use cases are still
> safe and it are the applications that need to migrate to a safe solution 
> should
> this be necessary. (Of course, this puts the burden on each individual
> application instead of having one central place to decide what is safe and
> what isn't, but I do not see any way around that.)
> 
> As far as I know, the kernel itself is not using any of the affected
> functionalities, so there is no need for an immediate mitigation within the
> kernel.

This does not seem to be entirely true. The code in security/keys/trusted.c 
might be affected under some circumstances, but I am not familiar enough with 
that code to say for sure. As far as I can tell, it does not create any RSA 
keys directly, but it might reference them as parent keys, so that vulnerable 
keys are used to protect the secrets. 

If that is the case, can we detect that issue and migrate the secrets to a safe 
configuration? Do we (as the kernel) want to do that? How much do we want/have 
to involve the user in the process?

> But I'd like to hear about how similar issues were handled in the past.
> I can think of multiple severe security issues, for example in BIOS
> implementations, but I cannot recall ever hearing about the kernel refusing
> to boot on such machines or even do so much as print a warning about that
> vulnerability.
> 
> > Alexander stated the following things about FW updates (Alexander,
> > please correct me if I state something incorrectly or if you have
> > something to add):
> >
> > * FW update can be constructed either in a way that the keys in the
> >   NVRAM are not cleared or in a way that they are cleared.
> 
> Correct. But as far as I know, the updates that were already published for 
> this
> issue do not delete any of the keys. And I do not think that this would be a
> good idea. After all, the applications still might need access to their key to
> decrypt their data and reencrypt it with a safe key after applying the update.
> 
> > * FW update cannot be directly applied to the TPM but must come as
> >   part of the firmware update from the vendor.
> 
> Yes, starting the upgrade process is guarded by platformAuth/platformPolicy
> (in the case of TPM2), so the platform vendor needs to be involved. And you
> want them to be involved, since they need to make sure that their system
> still works with the updated TPM. I'm not sure whether platform vendors do
> that for TPMs, but for wireless cards whitelisting in the BIOS is common, and
> you do not want your machine refusing to boot just because the BIOS does
> not recognize your TPM's firmware version anymore (as a simple example).
> 
> > I proposed the following as an alternative:
> >
> > * Print a message to the klog (which log level would be appropriate?).
> > * Possibly sleep for few seconds. Is this a good idea?
> 
> I'd be okay with that, but ideally we'd have some kind of agreement/history
> of how to handle similar security issues in hardware components in general.
> Implementing a special case just for this TPM vulnerability does not seem like
> the right thing to do, especially when the kernel itself is not affected (and
> thus the whole machine might not be affected for the way that it is used).
> We do not want to confuse users or make them expect similar warnings in
> the future, when we might have no intention of providing them.
> 
> > While writing this email yet another alternative popped into my mind:
> > what if we allow only in-kernel use but disallow the use of /dev/tpm0?
> > You could still use trusted keys.
> >
> > Here are all the ideas that I have and I am open for better
> > alternatives.
> >
> > /Jarkko
> 
> Alexander


RE: [PATCH v2] tpm-dev-common: Reject too short writes

2017-09-06 Thread Alexander.Steffen
> On Wed, Sep 06, 2017 at 03:42:33PM +0300, Jarkko Sakkinen wrote:
> > On Mon, Sep 04, 2017 at 07:36:42PM +0200, Alexander Steffen wrote:
> > > tpm_transmit() does not offer an explicit interface to indicate the
> > > number of valid bytes in the communication buffer. Instead, it
> > > relies on the commandSize field in the TPM header that is encoded within
> the buffer.
> > > Therefore, ensure that a) enough data has been written to the
> > > buffer, so that the commandSize field is present and b) the
> > > commandSize field does not announce more data than has been written
> to the buffer.
> > >
> > > This should have been fixed with CVE-2011-1161 long ago, but
> > > apparently a correct version of that patch never made it into the kernel.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Alexander Steffen 
> > > ---
> > > v2:
> > > - Moved all changes to tpm_common_write in a single patch.
> > >
> > >  drivers/char/tpm/tpm-dev-common.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/char/tpm/tpm-dev-common.c
> > > b/drivers/char/tpm/tpm-dev-common.c
> > > index 610638a..ac25574 100644
> > > --- a/drivers/char/tpm/tpm-dev-common.c
> > > +++ b/drivers/char/tpm/tpm-dev-common.c
> > > @@ -99,7 +99,8 @@ ssize_t tpm_common_write(struct file *file, const
> char __user *buf,
> > >   if (atomic_read(>data_pending) != 0)
> > >   return -EBUSY;
> > >
> > > - if (in_size > TPM_BUFSIZE)
> > > + if (in_size > sizeof(priv->data_buffer) || in_size < 6 ||
> > > + in_size < be32_to_cpu(*((__be32 *) (buf + 2
> > >   return -E2BIG;
> > >
> > >   mutex_lock(>buffer_mutex);
> > > --
> > > 2.7.4
> > >
> >
> > How did you test this change after you implemented it?
> >
> > Just thinking what to add to
> > https://github.com/jsakkine-intel/tpm2-scripts

I already had test cases that failed with some of my TPMs under some 
circumstances. I'll try to come up with a concise description of what those 
tests do or send you a patch directly for your tests. GitHub pull requests are 
okay for that repository? (I already have one waiting there.)

> >
> > /Jarkko
> 
> Just when I started to implement this that the bug fix itself does not have 
> yet
> the right semantics.
> 
> It should be just add a new check:
> 
> if (in_size != be32_to_cpu(*((__be32 *) (buf + 2
>   return -EINVAL;
> 
> The existing check is correct. This was missing. The reason for this is that 
> we
> process whatever is in the in_size bytes as a full command.

There are two problems with this solution:

1. You need to check for in_size < 6, otherwise you read data that has not been 
written there during that request. I haven't tested this, but I'd expect it to 
fail for example if you try to send the two commands "00 00 00 00 00 02" and 
"00 00". The first will be rejected with EINVAL, because 6 (in_size) != 2 
(commandSize). But the second will pass that check, because now in_size happens 
to match the commandSize that has only been written to the buffer for the first 
command.

2. You may not reject commands where in_size > commandSize, because TIS/PTP 
require the TPM to throw away extra bytes (and process the command as usual), 
not fail the command. You can see that in the State Transition Table (Table 18 
in TIS 1.3), line 20, with the TPM in Reception state and Expect=0, writing 
more data does not change the state ("Write is not expected. Drop write. TPM 
ignores this state transition."). Of course, since we do not pass on in_size, 
but only commandSize the TPM will never see those extra bytes, but the external 
behavior (for user space applications) is identical.

When you fix those two problems, you're probably back at my solution. The only 
other change I made (renaming TPM_BUFSIZE to sizeof(priv->data_buffer)) does 
not change anything, the values are identical. I just find it very confusing 
when you compare something against a magic constant to avoid buffer overflows 
in your memcpy. Using the buffer size directly is much more straightforward 
(and less prone to fail as soon as someone changes the structure definition).

> 
> Sorry I didn't notice before I started to implement a test case.
> 
> /Jarkko

Alexander


RE: [PATCH v2] tpm-dev-common: Reject too short writes

2017-09-08 Thread Alexander.Steffen
> On Wed, Sep 06, 2017 at 02:19:28PM +,
> alexander.stef...@infineon.com wrote:
> > > On Wed, Sep 06, 2017 at 03:42:33PM +0300, Jarkko Sakkinen wrote:
> > > > On Mon, Sep 04, 2017 at 07:36:42PM +0200, Alexander Steffen wrote:
> > > > > tpm_transmit() does not offer an explicit interface to indicate the
> > > > > number of valid bytes in the communication buffer. Instead, it
> > > > > relies on the commandSize field in the TPM header that is encoded
> within
> > > the buffer.
> > > > > Therefore, ensure that a) enough data has been written to the
> > > > > buffer, so that the commandSize field is present and b) the
> > > > > commandSize field does not announce more data than has been
> written
> > > to the buffer.
> > > > >
> > > > > This should have been fixed with CVE-2011-1161 long ago, but
> > > > > apparently a correct version of that patch never made it into the
> kernel.
> > > > >
> > > > > Cc: sta...@vger.kernel.org
> > > > > Signed-off-by: Alexander Steffen
> 
> > > > > ---
> > > > > v2:
> > > > > - Moved all changes to tpm_common_write in a single patch.
> > > > >
> > > > >  drivers/char/tpm/tpm-dev-common.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/char/tpm/tpm-dev-common.c
> > > > > b/drivers/char/tpm/tpm-dev-common.c
> > > > > index 610638a..ac25574 100644
> > > > > --- a/drivers/char/tpm/tpm-dev-common.c
> > > > > +++ b/drivers/char/tpm/tpm-dev-common.c
> > > > > @@ -99,7 +99,8 @@ ssize_t tpm_common_write(struct file *file,
> const
> > > char __user *buf,
> > > > >   if (atomic_read(>data_pending) != 0)
> > > > >   return -EBUSY;
> > > > >
> > > > > - if (in_size > TPM_BUFSIZE)
> > > > > + if (in_size > sizeof(priv->data_buffer) || in_size < 6 ||
> > > > > + in_size < be32_to_cpu(*((__be32 *) (buf + 2
> > > > >   return -E2BIG;
> > > > >
> > > > >   mutex_lock(>buffer_mutex);
> > > > > --
> > > > > 2.7.4
> > > > >
> > > >
> > > > How did you test this change after you implemented it?
> > > >
> > > > Just thinking what to add to
> > > > https://github.com/jsakkine-intel/tpm2-scripts
> >
> > I already had test cases that failed with some of my TPMs under some
> circumstances. I'll try to come up with a concise description of what those
> tests do or send you a patch directly for your tests. GitHub pull requests are
> okay for that repository? (I already have one waiting there.)
> >
> > > >
> > > > /Jarkko
> > >
> > > Just when I started to implement this that the bug fix itself does not 
> > > have
> yet
> > > the right semantics.
> > >
> > > It should be just add a new check:
> > >
> > > if (in_size != be32_to_cpu(*((__be32 *) (buf + 2
> > >   return -EINVAL;
> > >
> > > The existing check is correct. This was missing. The reason for this is 
> > > that
> we
> > > process whatever is in the in_size bytes as a full command.
> >
> > There are two problems with this solution:
> >
> > 1. You need to check for in_size < 6, otherwise you read data that has
> > not been written there during that request. I haven't tested this, but
> > I'd expect it to fail for example if you try to send the two commands
> > "00 00 00 00 00 02" and "00 00". The first will be rejected with
> > EINVAL, because 6 (in_size) != 2 (commandSize). But the second will
> > pass that check, because now in_size happens to match the commandSize
> > that has only been written to the buffer for the first command.
> 
> AFAIK tpm_transmit checks that the command has at least the header.

This was only a simple example, it will fail with other values as well. Just 
add 8 to both size fields and append 8 null bytes, and you will pass the length 
check in tpm_transmit but still have incorrect data. Also, it is probably no 
good style to omit checks for obvious errors, just because an unrelated check 
in a completely different location also happens to catch the problem under some 
circumstances. tpm_common_write discards the relevant information (in_size), so 
all other parts of the code need to be able to rely on it to have validated it 
properly.

> > 2. You may not reject commands where in_size > commandSize, because
> > TIS/PTP require the TPM to throw away extra bytes (and process the
> > command as usual), not fail the command. You can see that in the State
> > Transition Table (Table 18 in TIS 1.3), line 20, with the TPM in
> > Reception state and Expect=0, writing more data does not change the
> > state ("Write is not expected. Drop write. TPM ignores this state
> > transition."). Of course, since we do not pass on in_size, but only
> > commandSize the TPM will never see those extra bytes, but the external
> > behavior (for user space applications) is identical.
> 
> OK, this is more relevant. What is the legit case to send extra bytes?

For me: testing that my TPM implementation behaves correctly :) I can run the 
same test cases against the TPM using the Linux driver and several other, 
unrelated means. 

RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-21 Thread Alexander.Steffen
> > > On 10/20/2017 08:12 PM, alexander.stef...@infineon.com wrote:
> > > >> The TPM burstcount status indicates the number of bytes that can
> > > >> be sent to the TPM without causing bus wait states.  Effectively,
> > > >> it is the number of empty bytes in the command FIFO.
> > > >>
> > > >> This patch optimizes the tpm_tis_send_data() function by checking
> > > >> the burstcount only once. And if the burstcount is valid, it writes
> > > >> all the bytes at once, permitting wait state.
> > > >>
> > > >> After this change, performance on a TPM 1.2 with an 8 byte
> > > >> burstcount for 1000 extends improved from ~41sec to ~14sec.
> > > >>
> > > >> Suggested-by: Ken Goldman  in
> > > >> conjunction with the TPM Device Driver work group.
> > > >> Signed-off-by: Nayna Jain
> > > >> Acked-by: Mimi Zohar
> > > >> ---
> > > >>   drivers/char/tpm/tpm_tis_core.c | 42 +++--
> --
> > --
> > > 
> > > >> 
> > > >>   1 file changed, 15 insertions(+), 27 deletions(-)
> > > >>
> > > >> diff --git a/drivers/char/tpm/tpm_tis_core.c
> > > >> b/drivers/char/tpm/tpm_tis_core.c
> > > >> index b33126a35694..993328ae988c 100644
> > > >> --- a/drivers/char/tpm/tpm_tis_core.c
> > > >> +++ b/drivers/char/tpm/tpm_tis_core.c
> > > >> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip
> > > *chip,
> > > >> u8 *buf, size_t len)
> > > >>   {
> > > >>struct tpm_tis_data *priv = dev_get_drvdata(>dev);
> > > >>int rc, status, burstcnt;
> > > >> -  size_t count = 0;
> > > >>bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> > > >>
> > > >>status = tpm_tis_status(chip);
> > > >> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct
> tpm_chip
> > > *chip,
> > > >> u8 *buf, size_t len)
> > > >>}
> > > >>}
> > > >>
> > > >> -  while (count < len - 1) {
> > > >> -  burstcnt = get_burstcount(chip);
> > > >> -  if (burstcnt < 0) {
> > > >> -  dev_err(>dev, "Unable to read
> burstcount\n");
> > > >> -  rc = burstcnt;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> -  burstcnt = min_t(int, burstcnt, len - count - 1);
> > > >> -  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> > > >>> locality),
> > > >> -   burstcnt, buf + count);
> > > >> -  if (rc < 0)
> > > >> -  goto out_err;
> > > >> -
> > > >> -  count += burstcnt;
> > > >> -
> > > >> -  if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> > > >>> timeout_c,
> > > >> -  >int_queue, false) < 0) {
> > > >> -  rc = -ETIME;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> -  status = tpm_tis_status(chip);
> > > >> -  if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0)
> {
> > > >> -  rc = -EIO;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> +  /*
> > > >> +   * Get the initial burstcount to ensure TPM is ready to
> > > >> +   * accept data.
> > > >> +   */
> > > >> +  burstcnt = get_burstcount(chip);
> > > >> +  if (burstcnt < 0) {
> > > >> +  dev_err(>dev, "Unable to read burstcount\n");
> > > >> +  rc = burstcnt;
> > > >> +  goto out_err;
> > > >>}
> > > >>
> > > >> +  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> >locality),
> > > >> +  len - 1, buf);
> > > >> +  if (rc < 0)
> > > >> +  goto out_err;
> > > >> +
> > > >>/* write last byte */
> > > >> -  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> > > >> buf[count]);
> > > >> +  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> buf[len-
> > > >> 1]);
> > > >>if (rc < 0)
> > > >>goto out_err;
> > > >>
> > > >> --
> > > >> 2.13.3
> > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying 
> > > > to
> > > send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> > > seem to take an unusual amount of time. More analysis probably has to
> > wait
> > > until November, since I am going to be in Prague next week.
> > >
> > > Thanks Alex for testing these.. Did you get the chance to do any further
> > > analysis ?
> >
> > I am working on that now. Ken's suggestion seems reasonable, so I am
> going
> > to test whether correctly waiting for the flags to change fixes the problem.
> If
> > it does, I'll send the patches.
> 
> Sorry for the delay, I had to take care of some device tree changes in v4.14
> that broke my ARM test machines.
> 
> I've implemented some patches that fix the issue that Ken pointed out and
> rebased your patch 2/4 ("ignore burstcount") on top. While doing this I
> noticed that your original patch does not, as the commit message says, 

RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-07 Thread Alexander.Steffen
> Dear Linux folks,
> 
> 
> With Linux 4.15-rc2 built by the Ubuntu Kernel Team, the error messages
> below are shown by the Linux kernel. These are new.
> 
> ```
> Dez 06 13:22:24 Ixpees kernel: microcode: microcode updated early to
> revision 0x62, date = 2017-04-27
> Dez 06 13:22:24 Ixpees kernel: Linux version 4.15.0-041500rc2-generic
> (kernel@gloin) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3)) #201712031230
> SMP Sun Dec 3 17:32:03 UTC 2017
> Dez 06 13:22:24 Ixpees kernel: Command line:
> BOOT_IMAGE=/vmlinuz-4.15.0-041500rc2-generic
> root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
> Dez 06 13:22:24 Ixpees kernel: KERNEL supported cpus:
> Dez 06 13:22:24 Ixpees kernel:   Intel GenuineIntel
> Dez 06 13:22:24 Ixpees kernel:   AMD AuthenticAMD
> Dez 06 13:22:24 Ixpees kernel:   Centaur CentaurHauls
> Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x001:
> 'x87 floating point registers'
> Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x002:
> 'SSE registers'
> […]
> Dez 06 13:22:24 Ixpees kernel: tpm_tis MSFT0101:00: 2.0 TPM (device-id
> 0xFE, rev-id 4)
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: TPM self test failed
> ```
> 
> Any idea what to do about those?

The list of "A TPM error (2314) occurred continue selftest" is caused by my 
commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React correctly to 
RC_TESTING from TPM 2.0 self tests"). 2314 is TPM_RC_TESTING, so the TPM tells 
us that self-tests are still running in the background. This problem was not 
visible in previous versions, since it (incorrectly) ignored TPM_RC_TESTING.

The final error "TPM self test failed" is then the driver giving up after too 
many TPM_RC_TESTING responses have been received.

What confuses me a bit is that according to your log all of that happens within 
the same second. The driver uses tpm2_calc_ordinal_duration for 
TPM2_CC_SELF_TEST which maps to TPM_LONG and finally to TPM2_DURATION_LONG = 
2000ms = 2s. So you should see those retries for at least two seconds. But 
since it does not give the TPM enough time to execute the tests, you see it 
failing in the end.

Could you try to find out how much time exactly (in milliseconds) it takes from 
the first "A TPM error" to the final "TPM self test failed" message? Is it 
possible that tpm_msleep delays for significantly less time in this case?

Also, while looking at the code I've noticed that the retry loop is not written 
in the best possible way and should actually try one more time. Could you make 
the following change to tpm2_do_selftest in drivers/char/tpm/tpm2-cmd.c and see 
whether that helps? 

---
duration = jiffies_to_msecs(
tpm2_calc_ordinal_duration(chip, TPM2_CC_SELF_TEST));
 
-   while (duration > 0) {
+   while (1) {
cmd.header.in = tpm2_selftest_header;
cmd.params.selftest_in.full_test = 0;
 
rc = tpm_transmit_cmd(chip, NULL, ,
TPM2_SELF_TEST_IN_SIZE,
  0, 0, "continue selftest");
 
-   if (rc != TPM2_RC_TESTING)
+   if (rc != TPM2_RC_TESTING || duration <= 0)
break;
 
tpm_msleep(delay_msec);
---

> If these are intentional, it’d be great
> to give some hint to the user, what effect this has.

I agree that those error messages in their current form are not that helpful 
for the users. But they are part of the general driver architecture, and are 
also caused by other parts of the code (e.g. when using a TPM 1.2 that is 
deactivated or when the platform did not send a startup command). We should 
find a way to hide (or at least mark) those kind of expected errors.

Alexander


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-08 Thread Alexander.Steffen
> On Thu, Dec 07, 2017 at 03:56:07PM +, alexander.stef...@infineon.com
> wrote:
> 
> > > If these are intentional, it’d be great
> > > to give some hint to the user, what effect this has.
> >
> > I agree that those error messages in their current form are not that
> > helpful for the users. But they are part of the general driver
> > architecture, and are also caused by other parts of the code
> > (e.g. when using a TPM 1.2 that is deactivated or when the platform
> > did not send a startup command). We should find a way to hide (or at
> > least mark) those kind of expected errors.
> 
> Other parts of the TPM code did supresses 'expected' errors like this,
> I'm not sure if it got removed during Jarkko's cleanup though - we
> need to put stuff like that back, it should not printk for something
> like this.

Yes, I've got this task here somewhere, just no time to do it...

> For this, if we are waiting then it should compute an absolute time
> after which it will give up.
> 
> Code it like this instead and get rid of the ugly 'duration' scheme.
> 
> ktime_t stop = ktime_add_ns(ktime_get(), [timeout in ns]);
> do
> {
> }
> while (ktime_before(ktime_get(), stop);

Is it really that ugly? I still need delay_msec to increase the delay each 
round. I can see the benefit of your suggestion when it is important to get the 
timing exactly right (and also account for time spent elsewhere, when our code 
might not be executing). But in this case having delays that are approximately 
right (or longer than intended) is sufficient.

Anyway, from the log messages it is clear that tpm_msleep got called seven 
times with delays of 20/40/80/160/320/640/1280ms. But still all timestamps lie 
within the same second. How can this be with a cumulated delay of ~2.5s?

Also, I've just noticed that despite the name tpm_msleep calls usleep_range, 
not msleep. Can this have an influence? Should tpm_msleep call msleep for 
longer delays, as suggested by Documentation/timers/timers-howto.txt?

Alexander


RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`

2017-12-11 Thread Alexander.Steffen
> Dear Jason,
> 
> 
> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> >
> >> I have no access to the system right now, but want to point out, that the
> >> log was created by `journactl -k`, so I do not know if that messes with the
> >> time stamps. I checked the output of `dmesg` but didn’t see the TPM
> error
> >> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id
> 0xFE,
> >> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> >
> > It is a good question, I don't know.. If your kernel isn't setup to
> > timestamp messages then the journalstamp will certainly be garbage.
> >
> > No idea why you wouldn't see the messages in dmesg, if they are not in
> > dmesg they couldn't get into the journal
> 
> It looks like I was running an older Linux kernel version, when running
> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> kernel time stamps, showing that the delays work correctly.
> 
> ```
> $ uname -a
> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> $ sudo dmesg | grep TPM
> [0.00] ACPI: TPM2 0x6F332168 34 (v03Tpm2Tabl
> 0001 AMI  )
> [1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> [1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> [1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> [2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> [3.734808] tpm tpm0: TPM self test failed
> [3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> ```

Thanks for the fixed log. So your TPM seems to be rather slow with executing 
the selftests. Could try to apply the patch that I've just sent you? It ensures 
that your TPM gets more time to execute all the tests, up to the limit set in 
the PTP.

(I would have sent that patch also to linux-kernel@vger.kernel.org, since it 
was included in the discussion, but for some strange reason my mail system 
declared that to be an "Invalid address"...)

Alexander


RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-08 Thread Alexander.Steffen
> On 10/20/2017 08:12 PM, alexander.stef...@infineon.com wrote:
> >> The TPM burstcount status indicates the number of bytes that can
> >> be sent to the TPM without causing bus wait states.  Effectively,
> >> it is the number of empty bytes in the command FIFO.
> >>
> >> This patch optimizes the tpm_tis_send_data() function by checking
> >> the burstcount only once. And if the burstcount is valid, it writes
> >> all the bytes at once, permitting wait state.
> >>
> >> After this change, performance on a TPM 1.2 with an 8 byte
> >> burstcount for 1000 extends improved from ~41sec to ~14sec.
> >>
> >> Suggested-by: Ken Goldman  in
> >> conjunction with the TPM Device Driver work group.
> >> Signed-off-by: Nayna Jain
> >> Acked-by: Mimi Zohar
> >> ---
> >>   drivers/char/tpm/tpm_tis_core.c | 42 +++--
> 
> >> 
> >>   1 file changed, 15 insertions(+), 27 deletions(-)
> >>
> >> diff --git a/drivers/char/tpm/tpm_tis_core.c
> >> b/drivers/char/tpm/tpm_tis_core.c
> >> index b33126a35694..993328ae988c 100644
> >> --- a/drivers/char/tpm/tpm_tis_core.c
> >> +++ b/drivers/char/tpm/tpm_tis_core.c
> >> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip
> *chip,
> >> u8 *buf, size_t len)
> >>   {
> >>struct tpm_tis_data *priv = dev_get_drvdata(>dev);
> >>int rc, status, burstcnt;
> >> -  size_t count = 0;
> >>bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> >>
> >>status = tpm_tis_status(chip);
> >> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip
> *chip,
> >> u8 *buf, size_t len)
> >>}
> >>}
> >>
> >> -  while (count < len - 1) {
> >> -  burstcnt = get_burstcount(chip);
> >> -  if (burstcnt < 0) {
> >> -  dev_err(>dev, "Unable to read burstcount\n");
> >> -  rc = burstcnt;
> >> -  goto out_err;
> >> -  }
> >> -  burstcnt = min_t(int, burstcnt, len - count - 1);
> >> -  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> >>> locality),
> >> -   burstcnt, buf + count);
> >> -  if (rc < 0)
> >> -  goto out_err;
> >> -
> >> -  count += burstcnt;
> >> -
> >> -  if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> >>> timeout_c,
> >> -  >int_queue, false) < 0) {
> >> -  rc = -ETIME;
> >> -  goto out_err;
> >> -  }
> >> -  status = tpm_tis_status(chip);
> >> -  if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
> >> -  rc = -EIO;
> >> -  goto out_err;
> >> -  }
> >> +  /*
> >> +   * Get the initial burstcount to ensure TPM is ready to
> >> +   * accept data.
> >> +   */
> >> +  burstcnt = get_burstcount(chip);
> >> +  if (burstcnt < 0) {
> >> +  dev_err(>dev, "Unable to read burstcount\n");
> >> +  rc = burstcnt;
> >> +  goto out_err;
> >>}
> >>
> >> +  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> >> +  len - 1, buf);
> >> +  if (rc < 0)
> >> +  goto out_err;
> >> +
> >>/* write last byte */
> >> -  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> >> buf[count]);
> >> +  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[len-
> >> 1]);
> >>if (rc < 0)
> >>goto out_err;
> >>
> >> --
> >> 2.13.3
> > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to
> send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> seem to take an unusual amount of time. More analysis probably has to wait
> until November, since I am going to be in Prague next week.
> 
> Thanks Alex for testing these.. Did you get the chance to do any further
> analysis ?

I am working on that now. Ken's suggestion seems reasonable, so I am going to 
test whether correctly waiting for the flags to change fixes the problem. If it 
does, I'll send the patches.

Alexander


RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-23 Thread Alexander.Steffen
> On Wed, Nov 22, 2017 at 06:52:03AM +,
> alexander.stef...@infineon.com wrote:
> > > > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when 
> > > > > > trying
> to
> > > > > send large amounts of data, e.g. with TPM2_Hash, and subsequent
> tests
> > > > > seem to take an unusual amount of time. More analysis probably has
> to
> > > > wait
> > > > > until November, since I am going to be in Prague next week.
> > > > >
> > > > > Thanks Alex for testing these.. Did you get the chance to do any
> further
> > > > > analysis ?
> > > >
> > > > I am working on that now. Ken's suggestion seems reasonable, so I am
> > > going
> > > > to test whether correctly waiting for the flags to change fixes the
> problem.
> > > If
> > > > it does, I'll send the patches.
> > >
> > > Sorry for the delay, I had to take care of some device tree changes in
> v4.14
> > > that broke my ARM test machines.
> > >
> > > I've implemented some patches that fix the issue that Ken pointed out
> and
> > > rebased your patch 2/4 ("ignore burstcount") on top. While doing this I
> > > noticed that your original patch does not, as the commit message says,
> write
> > > all the bytes at once, but still unnecessarily splits all commands into at
> least
> > > two transfers (as did the original code). I've fixed this as well in my
> patches,
> > > so that all bytes are indeed sent in a single call, without special 
> > > handling
> for
> > > the last byte. This should speed up things further, especially for small
> > > commands and drivers like tpm_tis_spi, where writing a single byte
> > > translates into additional SPI transfers.
> 
> Thanks Alex, for digging into.
> 
> Yeah, you are right, the first version of this patch sent all the bytes 
> together,
> but after hearing ddwg inputs,
> i.e. "The last byte was introduced for error checking purposes (history).", I
> reverted back to original to be safe.
> 
> It seems that the last byte was sent from the beginning (27084ef [PATCH]
> tpm: driver for next generation TPM chips,),
> does anyone remember the reason ?

The intention seems to be to make extra sure that the TPM has correctly 
understood the command by observing the Expect flag flipping from 1 to 0 when 
writing the last byte.

But following Ken's arguments, this does not work as intended, because the 
Expect flag will change not when writing the last byte to the FIFO, but when 
the TPM reads the last byte from the FIFO. Since there is no "FIFO empty" 
indication, just observing the Expect flag to be 1 before writing the last 
byte, cannot reliably tell us anything (there might be enough data left in the 
FIFO for the Expect flag to flip to 0 without writing the last byte).

Also, I'd argue that this check is not necessary, because if the Expect flag is 
0 after all bytes have been written to the FIFO, then the TPM has correctly 
received the command and is ready to execute it. According to TIS/PTP the TPM 
is required to throw away all extra bytes that were not announced in the 
header, and in addition the kernel driver already ensures not to send more 
data. That are enough safeguards, I'd say.

> 
> > >
> > > Unfortunately, even with those changes the problem persists. But I've
> got
> > > more detailed logs now and will try to understand and hopefully fix the
> issue.
> > > I'll follow up with more details and/or patches once I know more.
> >
> > Okay, so the problem seems to be that at some point the TPM starts
> inserting wait states for the FIFO access. The driver tries to handle this, 
> but
> fails since even the 50 retries that are currently used do not seem to be
> enough. Adding small (millisecond) delays between the attempts did not
> help so far.
> >
> > Is there any limit in the specification for how many wait states the TPM may
> generate or for how long it may do so? I could not find anything, but we need
> to use something there to prevent a faulty TPM from blocking the kernel
> forever.
> >
> 
> I have been thinking on this, so was wondering:
> 
> 1. As you said the problem started while sending large amounts of data for
> TPM2_Hash, how large is "large" ? I mean did it work for some specific large
> values before failing.

Around 1k of data (the exact values are chosen randomly, and it failed many 
times), but I did not try to find a specific boundary. The interesting thing 
was that for this long command all SPI frames with the maximum payload of 64 
bytes were accepted without wait states, but the last frame (with less than 64 
bytes) caused the wait states.

> 2. Are these wait states limited to SPI, or does it happen on LPC as well?

I do not know for LPC because there the wait states are handled in hardware and 
I cannot trace the LPC signals.

> Thanks & Regards,
>- Nayna
> 
> 
> > Alexander
> >
> 


RE: [PATCH v3] tpm-dev-common: Reject too short writes

2017-09-11 Thread Alexander.Steffen
> On Sat, Sep 09, 2017 at 12:37:39AM +0300, Jarkko Sakkinen wrote:
> > On Fri, Sep 08, 2017 at 05:21:32PM +0200, Alexander Steffen wrote:
> > > tpm_transmit() does not offer an explicit interface to indicate the
> number
> > > of valid bytes in the communication buffer. Instead, it relies on the
> > > commandSize field in the TPM header that is encoded within the buffer.
> > > Therefore, ensure that a) enough data has been written to the buffer, so
> > > that the commandSize field is present and b) the commandSize field does
> not
> > > announce more data than has been written to the buffer.
> > >
> > > This should have been fixed with CVE-2011-1161 long ago, but apparently
> > > a correct version of that patch never made it into the kernel.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Signed-off-by: Alexander Steffen 
> > > ---
> > > v2:
> > > - Moved all changes to tpm_common_write in a single patch.
> > > v3:
> > > - Access data copied from user space (priv->data_buffer) instead of user
> > >   space data directly (buf).
> > > - Changed return code to EINVAL.
> > >
> > >  drivers/char/tpm/tpm-dev-common.c | 6 ++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/drivers/char/tpm/tpm-dev-common.c
> b/drivers/char/tpm/tpm-dev-common.c
> > > index 610638a..461bf0b 100644
> > > --- a/drivers/char/tpm/tpm-dev-common.c
> > > +++ b/drivers/char/tpm/tpm-dev-common.c
> > > @@ -110,6 +110,12 @@ ssize_t tpm_common_write(struct file *file,
> const char __user *buf,
> > >   return -EFAULT;
> > >   }
> > >
> > > + if (in_size < 6 ||
> > > + in_size < be32_to_cpu(*((__be32 *) (priv->data_buffer + 2 {
> > > + mutex_unlock(>buffer_mutex);
> > > + return -EINVAL;
> > > + }
> > > +
> > >   /* atomic tpm command send and result receive. We only hold the
> ops
> > >* lock during this period so that the tpm can be unregistered even if
> > >* the char dev is held open.
> > > --
> > > 2.7.4
> > >
> >
> > I'm not gonna fight about that "in_size < 6" check. I think it is not
> > needed, I understand your point but still disagree but it is something
> > where I can live with having it.
> >
> > I kind of disagree also with allowing messages longer than the command
> > size but it does not have to be in the scope of this commit and actually
> > should be a separate discussion if we ever going to do something about
> > it.
> >
> > Thanks for the patience!
> >
> > Reviewed-by: Jarkko Sakkinen 
> >
> > /Jarkko
> 
> Without your fix:
> 
> $ python -m unittest -v tpm2_smoke.SmokeTest.test_too_short_cmd
> test_too_short_cmd (tpm2_smoke.SmokeTest) ... FAIL
> 
> ==
> 
> FAIL: test_too_short_cmd (tpm2_smoke.SmokeTest)
> --
> Traceback (most recent call last):
>   File "tpm2_smoke.py", line 157, in test_too_short_cmd
> self.assertEqual(rejected, True)
> AssertionError: False != True
> 
> --
> Ran 1 test in 2.108s
> 
> FAILED (failures=1)
> 
> The test case expects to get a posix error, which it doesn't get.
> 
> With your fix:
> 
> $ python -m unittest -v tpm2_smoke.SmokeTest.test_too_short_cmd
> test_too_short_cmd (tpm2_smoke.SmokeTest) ... ok
> 
> --
> Ran 1 test in 2.099s
> 
> OK
> 
> So looks good to me.
> 
> Tested-by: Jarkko Sakkinen 
> 
> Can you test the master branch with SPI TPM? I had to tinker your
> commits a bit because of merge conflicts with Arnd's commit. I'll
> put everything back to next as soon as I hear from you. Thanks
> 
> /Jarkko

tpm_tis_spi in master (3897f7c) is broken, it does not transfer any data. I'll 
send you a fixed patch for the DMA change.

Alexander


RE: [tpmdd-devel] [PATCH v2 0/4] additional TPM performance improvements

2017-09-11 Thread Alexander.Steffen
> After further discussions with the Device Driver working group (ddwg),
> the following changes were made:
> 
> * Check for burstcount at least once to confirm the TPM is ready to accept
> the data. Similarly, query for the TPM Expect status as sanity check at
> the end.
> 
> * Make the sleep for status check during send() in the loop less than
> 5msec.
> 
> * Make the sleep in the loop while querying for burstcount less than
> 5msec.
> 
> Below is the list of patches along with the performance improvements
> seen with a TPM 1.2 with an 8 byte burstcount for 1000 extends:
> 
> Patch|Improvement(time in sec)
> 
> tpm: ignore burstcount to improve tpm_tis| ~41 - ~14
> send() performance.
> 
> tpm: define __wait_for_tpm_stat to specify   | ~14 - ~10
> variable polling sleep time
> 
> tpm: reduce tpm_msleep() time in | ~10 - ~9
> get_burstcount()
> 
> tpm: modify tpm_msleep() function to have| ~9 - ~8
> max range
> 
> Changelog v2:
> 
> * Add module parameter to handle ignoring of burst count during
> tpm tis send() operation.
> * Add improvements over sleep time to reduce delays.
> 
> Nayna Jain (4):
>   tpm: ignore burstcount to improve tpm_tis send() performance.
>   tpm: define __wait_for_tpm_stat to specify variable polling sleep time
>   tpm: reduce tpm_msleep() time in get_burstcount()
>   tpm: use tpm_msleep() value as max delay
> 
>  Documentation/admin-guide/kernel-parameters.txt |  8 ++
>  drivers/char/tpm/tpm-interface.c| 15 --
>  drivers/char/tpm/tpm.h  |  7 +++--
>  drivers/char/tpm/tpm_tis_core.c | 37 
> +++--
>  4 files changed, 53 insertions(+), 14 deletions(-)
> 
> --
> 2.13.3

I haven't looked at the changes in detail yet, but some initial comments:

The ignore_burst_count functionality has already received some negative 
feedback and probably needs more iterations, if it can be accepted at all, so 
you might want to split it off from the rest of the series.

Also, did you explore any alternative ways to activate that functionality 
besides a command line parameter? If it is off by default, then most users will 
never see the benefits, and if they do activate it manually, then they might 
hit some obscure bugs because those code paths receive only little testing. I 
could imagine for example activating this functionality automatically for 
certain safe drivers like tpm_tis_spi, where wait states cannot block us (or 
anyone else on the bus) indefinitely.

I ran all your patches through my tests (though without making any changes, so 
ignore_burst_count was off, since that is the default) and did not see any 
failures. My test bench does not contain performance tests (yet), so I cannot 
confirm your measurements. checkpatch.pl has a couple of complaints though.

Alexander


RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-19 Thread Alexander.Steffen
> On Tue, Oct 17, 2017 at 11:50:05AM +, alexander.stef...@infineon.com
> wrote:
> > > > Replace the specification of data structures by pointer dereferences
> > > > as the parameter for the operator "sizeof" to make the corresponding
> > > > size
> > > > determination a bit safer according to the Linux coding style
> > > > convention.
> > >
> > >
> > > This patch does one style in favor of the other.
> >
> > I actually prefer that style, so I'd welcome this change :)
> >
> > > At the end it's Jarkko's call, though I would NAK this as I think some
> > > one already told this to you for some other similar patch(es).
> > >
> > >
> > > I even would suggest to stop doing this noisy stuff, which keeps people
> > > busy for nothing.
> >
> > Cleaning up old code is also worth something, even if does not change
> > one bit in the assembly output in the end...
> >
> > Alexander
> 
> Quite insignificant clean up it is that does more harm that gives any
> benefit as any new change adds debt to backporting.
> 
> Anyway, this has been a useful patch set for me in the sense that I have
> clearer picture now on discarding/accepting commits.

Indeed. I have now a better understanding for why some code looks as ugly as it 
does.

> One line minor
> clean up will be from now on automatic NAK unless it causes a compiler
> warning or some other visible side-effect.

Not a nice policy, but at least a policy. I have deleted the tasks that I had 
still planned for other cleanup activities.

Alexander


RE: [PATCH v4 1/4] tpm: move wait_for_tpm_stat() to respective driver files

2017-10-19 Thread Alexander.Steffen
> On Tue, Oct 17, 2017 at 04:32:29PM -0400, Nayna Jain wrote:
> > The function wait_for_tpm_stat() is currently defined in
> > tpm-interface file. It is a hardware specific function used
> > only by tpm_tis and xen-tpmfront, so it is removed from
> > tpm-interface.c and defined in respective driver files.
> >
> > Suggested-by: Jarkko Sakkinen 
> > Signed-off-by: Nayna Jain 
> > Reviewed-by: Jarkko Sakkinen 
> > ---
> >  drivers/char/tpm/tpm-interface.c | 60 
> > 
> >  drivers/char/tpm/tpm.h   |  2 --
> >  drivers/char/tpm/tpm_tis_core.c  | 60
> 
> >  drivers/char/tpm/xen-tpmfront.c  | 60
> 
> >  4 files changed, 120 insertions(+), 62 deletions(-)
> >
> > diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-
> interface.c
> > index 1d6729be4cd6..313f7618d569 100644
> > --- a/drivers/char/tpm/tpm-interface.c
> > +++ b/drivers/char/tpm/tpm-interface.c
> > @@ -1035,66 +1035,6 @@ int tpm_send(u32 chip_num, void *cmd, size_t
> buflen)
> >  }
> >  EXPORT_SYMBOL_GPL(tpm_send);
> >
> > -static bool wait_for_tpm_stat_cond(struct tpm_chip *chip, u8 mask,
> > -   bool check_cancel, bool *canceled)
> > -{
> > -   u8 status = chip->ops->status(chip);
> > -
> > -   *canceled = false;
> > -   if ((status & mask) == mask)
> > -   return true;
> > -   if (check_cancel && chip->ops->req_canceled(chip, status)) {
> > -   *canceled = true;
> > -   return true;
> > -   }
> > -   return false;
> > -}
> > -
> > -int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask, unsigned long
> timeout,
> > - wait_queue_head_t *queue, bool check_cancel)
> > -{
> > -   unsigned long stop;
> > -   long rc;
> > -   u8 status;
> > -   bool canceled = false;
> > -
> > -   /* check current status */
> > -   status = chip->ops->status(chip);
> > -   if ((status & mask) == mask)
> > -   return 0;
> > -
> > -   stop = jiffies + timeout;
> > -
> > -   if (chip->flags & TPM_CHIP_FLAG_IRQ) {
> > -again:
> > -   timeout = stop - jiffies;
> > -   if ((long)timeout <= 0)
> > -   return -ETIME;
> > -   rc = wait_event_interruptible_timeout(*queue,
> > -   wait_for_tpm_stat_cond(chip, mask, check_cancel,
> > -  ),
> > -   timeout);
> > -   if (rc > 0) {
> > -   if (canceled)
> > -   return -ECANCELED;
> > -   return 0;
> > -   }
> > -   if (rc == -ERESTARTSYS && freezing(current)) {
> > -   clear_thread_flag(TIF_SIGPENDING);
> > -   goto again;
> > -   }
> > -   } else {
> > -   do {
> > -   tpm_msleep(TPM_TIMEOUT);
> > -   status = chip->ops->status(chip);
> > -   if ((status & mask) == mask)
> > -   return 0;
> > -   } while (time_before(jiffies, stop));
> > -   }
> > -   return -ETIME;
> > -}
> > -EXPORT_SYMBOL_GPL(wait_for_tpm_stat);
> > -
> >  #define TPM_ORD_SAVESTATE 152
> >  #define SAVESTATE_RESULT_SIZE 10
> >
> > diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> > index 2d5466a72e40..4fc83ac7abeb 100644
> > --- a/drivers/char/tpm/tpm.h
> > +++ b/drivers/char/tpm/tpm.h
> > @@ -525,8 +525,6 @@ int tpm_do_selftest(struct tpm_chip *chip);
> >  unsigned long tpm_calc_ordinal_duration(struct tpm_chip *chip, u32
> ordinal);
> >  int tpm_pm_suspend(struct device *dev);
> >  int tpm_pm_resume(struct device *dev);
> > -int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask, unsigned long
> timeout,
> > - wait_queue_head_t *queue, bool check_cancel);
> >
> >  static inline void tpm_msleep(unsigned int delay_msec)
> >  {
> > diff --git a/drivers/char/tpm/tpm_tis_core.c
> b/drivers/char/tpm/tpm_tis_core.c
> > index 63bc6c3b949e..b33126a35694 100644
> > --- a/drivers/char/tpm/tpm_tis_core.c
> > +++ b/drivers/char/tpm/tpm_tis_core.c
> > @@ -31,6 +31,66 @@
> >  #include "tpm.h"
> >  #include "tpm_tis_core.h"
> >
> > +static bool wait_for_tpm_stat_cond(struct tpm_chip *chip, u8 mask,
> > +   bool check_cancel, bool *canceled)
> > +{
> > +   u8 status = chip->ops->status(chip);
> > +
> > +   *canceled = false;
> > +   if ((status & mask) == mask)
> > +   return true;
> > +   if (check_cancel && chip->ops->req_canceled(chip, status)) {
> > +   *canceled = true;
> > +   return true;
> > +   }
> > +   return false;
> > +}
> > +
> > +static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
> > +   unsigned long timeout, wait_queue_head_t *queue,
> > +   bool check_cancel)
> > +{
> > +   unsigned long stop;
> > +   long rc;
> > +   u8 status;
> > +   bool canceled = false;
> > +
> > +   /* check current status */
> > +   status = 

RE: [PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-20 Thread Alexander.Steffen
> On Fri, Oct 20, 2017 at 12:01:39PM +0300, Jarkko Sakkinen wrote:
> > On Thu, Oct 19, 2017 at 04:58:23PM +,
> alexander.stef...@infineon.com wrote:
> > > > On Tue, Oct 17, 2017 at 11:50:05AM +,
> alexander.stef...@infineon.com
> > > > wrote:
> > > > > > > Replace the specification of data structures by pointer
> dereferences
> > > > > > > as the parameter for the operator "sizeof" to make the
> corresponding
> > > > > > > size
> > > > > > > determination a bit safer according to the Linux coding style
> > > > > > > convention.
> > > > > >
> > > > > >
> > > > > > This patch does one style in favor of the other.
> > > > >
> > > > > I actually prefer that style, so I'd welcome this change :)
> > > > >
> > > > > > At the end it's Jarkko's call, though I would NAK this as I think 
> > > > > > some
> > > > > > one already told this to you for some other similar patch(es).
> > > > > >
> > > > > >
> > > > > > I even would suggest to stop doing this noisy stuff, which keeps
> people
> > > > > > busy for nothing.
> > > > >
> > > > > Cleaning up old code is also worth something, even if does not change
> > > > > one bit in the assembly output in the end...
> > > > >
> > > > > Alexander
> > > >
> > > > Quite insignificant clean up it is that does more harm that gives any
> > > > benefit as any new change adds debt to backporting.
> > > >
> > > > Anyway, this has been a useful patch set for me in the sense that I have
> > > > clearer picture now on discarding/accepting commits.
> > >
> > > Indeed. I have now a better understanding for why some code looks as
> > > ugly as it does.
> > >
> > > > One line minor
> > > > clean up will be from now on automatic NAK unless it causes a compiler
> > > > warning or some other visible side-effect.
> > >
> > > Not a nice policy, but at least a policy. I have deleted the tasks
> > > that I had still planned for other cleanup activities.
> > >
> > > Alexander
> >
> > 1/4 and 2/4 are sensible clean ups as long as the commit message is
> > refined.
> >
> > Moving more functions to use tpm_buf instead of nasty tpm_cmd_t are
> > also welcome changes.
> >
> > Documenting functions (exported mainly) is also welcome. Or refining
> > documentation.
> >
> > It's really case by case. The important thing in small clean ups is
> > a clearly written commit message that explains rationale.
> >
> > /Jarkko
> 
> It's easy to say in retroperpective that code is "ugly". I would use
> strong consideration before using that adjective for mainline code.
> Rarely when you do something new first the form will be polished.

(Let's not argue over words, not being a native speaker, I'll probably not 
choose the perfect expression all the time.)

My assumption is that in many cases the code was not like that from the 
beginning. Over time, new functionality got added, interfaces expanded, etc. 
But when the goal tends to be to keep the changes minimal, then it is only 
natural for everyone to be focused on their small parts of the code, and not 
clean up the surrounding areas (or better integrate with them).

> I was too steep with the policy above. It's not exactly like that in the
> strict sense. It's always case by case like for any commit. However, it
> is good to remember that "ugliness" does not cause regressions.

Not by itself, no. But it causes cognitive strain while working with the code, 
and that might lead to bugs.

Alexander


RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-10-20 Thread Alexander.Steffen
> The TPM burstcount status indicates the number of bytes that can
> be sent to the TPM without causing bus wait states.  Effectively,
> it is the number of empty bytes in the command FIFO.
> 
> This patch optimizes the tpm_tis_send_data() function by checking
> the burstcount only once. And if the burstcount is valid, it writes
> all the bytes at once, permitting wait state.
> 
> After this change, performance on a TPM 1.2 with an 8 byte
> burstcount for 1000 extends improved from ~41sec to ~14sec.
> 
> Suggested-by: Ken Goldman  in
> conjunction with the TPM Device Driver work group.
> Signed-off-by: Nayna Jain 
> Acked-by: Mimi Zohar 
> ---
>  drivers/char/tpm/tpm_tis_core.c | 42 +++--
> 
>  1 file changed, 15 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm_tis_core.c
> b/drivers/char/tpm/tpm_tis_core.c
> index b33126a35694..993328ae988c 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip *chip,
> u8 *buf, size_t len)
>  {
>   struct tpm_tis_data *priv = dev_get_drvdata(>dev);
>   int rc, status, burstcnt;
> - size_t count = 0;
>   bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> 
>   status = tpm_tis_status(chip);
> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip *chip,
> u8 *buf, size_t len)
>   }
>   }
> 
> - while (count < len - 1) {
> - burstcnt = get_burstcount(chip);
> - if (burstcnt < 0) {
> - dev_err(>dev, "Unable to read burstcount\n");
> - rc = burstcnt;
> - goto out_err;
> - }
> - burstcnt = min_t(int, burstcnt, len - count - 1);
> - rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> >locality),
> -  burstcnt, buf + count);
> - if (rc < 0)
> - goto out_err;
> -
> - count += burstcnt;
> -
> - if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> >timeout_c,
> - >int_queue, false) < 0) {
> - rc = -ETIME;
> - goto out_err;
> - }
> - status = tpm_tis_status(chip);
> - if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
> - rc = -EIO;
> - goto out_err;
> - }
> + /*
> +  * Get the initial burstcount to ensure TPM is ready to
> +  * accept data.
> +  */
> + burstcnt = get_burstcount(chip);
> + if (burstcnt < 0) {
> + dev_err(>dev, "Unable to read burstcount\n");
> + rc = burstcnt;
> + goto out_err;
>   }
> 
> + rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> + len - 1, buf);
> + if (rc < 0)
> + goto out_err;
> +
>   /* write last byte */
> - rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> buf[count]);
> + rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), buf[len-
> 1]);
>   if (rc < 0)
>   goto out_err;
> 
> --
> 2.13.3

This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to send 
large amounts of data, e.g. with TPM2_Hash, and subsequent tests seem to take 
an unusual amount of time. More analysis probably has to wait until November, 
since I am going to be in Prague next week.

Alexander


RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-16 Thread Alexander.Steffen
> > On 10/20/2017 08:12 PM, alexander.stef...@infineon.com wrote:
> > >> The TPM burstcount status indicates the number of bytes that can
> > >> be sent to the TPM without causing bus wait states.  Effectively,
> > >> it is the number of empty bytes in the command FIFO.
> > >>
> > >> This patch optimizes the tpm_tis_send_data() function by checking
> > >> the burstcount only once. And if the burstcount is valid, it writes
> > >> all the bytes at once, permitting wait state.
> > >>
> > >> After this change, performance on a TPM 1.2 with an 8 byte
> > >> burstcount for 1000 extends improved from ~41sec to ~14sec.
> > >>
> > >> Suggested-by: Ken Goldman  in
> > >> conjunction with the TPM Device Driver work group.
> > >> Signed-off-by: Nayna Jain
> > >> Acked-by: Mimi Zohar
> > >> ---
> > >>   drivers/char/tpm/tpm_tis_core.c | 42 +++
> --
> > 
> > >> 
> > >>   1 file changed, 15 insertions(+), 27 deletions(-)
> > >>
> > >> diff --git a/drivers/char/tpm/tpm_tis_core.c
> > >> b/drivers/char/tpm/tpm_tis_core.c
> > >> index b33126a35694..993328ae988c 100644
> > >> --- a/drivers/char/tpm/tpm_tis_core.c
> > >> +++ b/drivers/char/tpm/tpm_tis_core.c
> > >> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip
> > *chip,
> > >> u8 *buf, size_t len)
> > >>   {
> > >>  struct tpm_tis_data *priv = dev_get_drvdata(>dev);
> > >>  int rc, status, burstcnt;
> > >> -size_t count = 0;
> > >>  bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> > >>
> > >>  status = tpm_tis_status(chip);
> > >> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct tpm_chip
> > *chip,
> > >> u8 *buf, size_t len)
> > >>  }
> > >>  }
> > >>
> > >> -while (count < len - 1) {
> > >> -burstcnt = get_burstcount(chip);
> > >> -if (burstcnt < 0) {
> > >> -dev_err(>dev, "Unable to read 
> > >> burstcount\n");
> > >> -rc = burstcnt;
> > >> -goto out_err;
> > >> -}
> > >> -burstcnt = min_t(int, burstcnt, len - count - 1);
> > >> -rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> > >>> locality),
> > >> - burstcnt, buf + count);
> > >> -if (rc < 0)
> > >> -goto out_err;
> > >> -
> > >> -count += burstcnt;
> > >> -
> > >> -if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> > >>> timeout_c,
> > >> ->int_queue, false) < 0) {
> > >> -rc = -ETIME;
> > >> -goto out_err;
> > >> -}
> > >> -status = tpm_tis_status(chip);
> > >> -if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
> > >> -rc = -EIO;
> > >> -goto out_err;
> > >> -}
> > >> +/*
> > >> + * Get the initial burstcount to ensure TPM is ready to
> > >> + * accept data.
> > >> + */
> > >> +burstcnt = get_burstcount(chip);
> > >> +if (burstcnt < 0) {
> > >> +dev_err(>dev, "Unable to read burstcount\n");
> > >> +rc = burstcnt;
> > >> +goto out_err;
> > >>  }
> > >>
> > >> +rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv->locality),
> > >> +len - 1, buf);
> > >> +if (rc < 0)
> > >> +goto out_err;
> > >> +
> > >>  /* write last byte */
> > >> -rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> > >> buf[count]);
> > >> +rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality), 
> > >> buf[len-
> > >> 1]);
> > >>  if (rc < 0)
> > >>  goto out_err;
> > >>
> > >> --
> > >> 2.13.3
> > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying to
> > send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> > seem to take an unusual amount of time. More analysis probably has to
> wait
> > until November, since I am going to be in Prague next week.
> >
> > Thanks Alex for testing these.. Did you get the chance to do any further
> > analysis ?
> 
> I am working on that now. Ken's suggestion seems reasonable, so I am going
> to test whether correctly waiting for the flags to change fixes the problem. 
> If
> it does, I'll send the patches.

Sorry for the delay, I had to take care of some device tree changes in v4.14 
that broke my ARM test machines.

I've implemented some patches that fix the issue that Ken pointed out and 
rebased your patch 2/4 ("ignore burstcount") on top. While doing this I noticed 
that your original patch does not, as the commit message says, write all the 
bytes at once, but still unnecessarily splits all commands into at least two 
transfers (as did the original code). I've 

RE: [PATCH] tpm: do handle area size validation only when TPM space used

2017-03-28 Thread Alexander.Steffen
Yes, this fixes the issue for me. Thanks.

Alexander

> -Original Message-
> From: Jarkko Sakkinen [mailto:jarkko.sakki...@linux.intel.com]
> Sent: Tuesday, March 28, 2017 12:25 PM
> To: Steffen Alexander (IFAG CCS ESS D SW A)
> Cc: tpmdd-de...@lists.sourceforge.net; linux-security-
> mod...@vger.kernel.org; Peter Huewe; Marcel Selhorst; Jason Gunthorpe;
> open list
> Subject: Re: [PATCH] tpm: do handle area size validation only when TPM
> space used
> 
> So do you need this or not?
> 
> /Jarkko
> 
> On Mon, Mar 27, 2017 at 12:08:15AM +0300, Jarkko Sakkinen wrote:
> > In order to not cause backwards compatibility issues with
> > /dev/tpm0 disable handle area size validation if tpm_transmit is not
> > called with a TPM space.
> >
> > Signed-off-by: Jarkko Sakkinen 
> > ---
> >  drivers/char/tpm/tpm-interface.c | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/char/tpm/tpm-interface.c
> > b/drivers/char/tpm/tpm-interface.c
> > index bf0c3fa..158c1db 100644
> > --- a/drivers/char/tpm/tpm-interface.c
> > +++ b/drivers/char/tpm/tpm-interface.c
> > @@ -328,7 +328,9 @@ unsigned long tpm_calc_ordinal_duration(struct
> > tpm_chip *chip,  }  EXPORT_SYMBOL_GPL(tpm_calc_ordinal_duration);
> >
> > -static bool tpm_validate_command(struct tpm_chip *chip, const u8
> > *cmd,
> > +static bool tpm_validate_command(struct tpm_chip *chip,
> > +struct tpm_space *space,
> > +const u8 *cmd,
> >  size_t len)
> >  {
> > const struct tpm_input_header *header = (const void *)cmd; @@ -
> 340,6
> > +342,9 @@ static bool tpm_validate_command(struct tpm_chip *chip,
> const u8 *cmd,
> > if (len < TPM_HEADER_SIZE)
> > return false;
> >
> > +   if (!space)
> > +   return true;
> > +
> > if (chip->flags & TPM_CHIP_FLAG_TPM2 && chip->nr_commands) {
> > cc = be32_to_cpu(header->ordinal);
> >
> > @@ -386,7 +391,7 @@ ssize_t tpm_transmit(struct tpm_chip *chip, struct
> tpm_space *space,
> > unsigned long stop;
> > bool need_locality;
> >
> > -   if (!tpm_validate_command(chip, buf, bufsiz))
> > +   if (!tpm_validate_command(chip, space, buf, bufsiz))
> > return -EINVAL;
> >
> > if (bufsiz > TPM_BUFSIZE)
> > --
> > 2.9.3
> >



RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-21 Thread Alexander.Steffen
> > There are a few special cases that need some thought though. For
> > example, it is possible to use an upgrade to switch the TPM family
> > from 1.2 to 2.0 (or vice versa). In this case it seems useful to let
> > the kernel reinitialize the TPM driver, so it uses the correct
> > timeouts for communication, activates the correct features (resource
> > manager or not?), etc., without needing to reboot the system.
> 
> In practice, would a TPM upgrade from TPM 1.2 to TPM 2.0 even occur
> without a reboot?  Is it an important use case?
> 
> 1 - It would leave the SHA-256 PCRs in the reset state.
> 
> 2 - It's possible that this upgrade would also require a BIOS upgrade.

For a traditional PC and when your goal is platform integrity, a reboot is 
probably the way to go. But in an embedded environment where there is no BIOS 
or if you use the TPM more like a smartcard just to store some keys (or 
generate random numbers), a reboot is unnecessary and it is more comfortable to 
avoid it.

We probably should inform the kernel before the upgrade anyway, so that it can 
shut down the TPM gracefully (and maybe switch to the upgrade mode, as Jason 
suggested). With that infrastructure in place, it does not seem like a lot of 
effort to also let it switch the TPM back to normal operation mode once the 
upgrade is complete.

Alexander


RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-17 Thread Alexander.Steffen
> Check for every TPM 2.0 command that the command code is supported and
> the command buffer has at least the length that can contain the header
> and the handle area.

This breaks several use cases for me:

1. I've got a TPM that implements vendor-specific command codes. Those cannot 
be send to the TPM anymore, but are rejected with EINVAL.

2. When upgrading the firmware on my TPM, it switches to a non-standard 
communication mode for the upgrade process and does not communicate using 
TPM2.0 commands during this time. Rejecting non-TPM2.0 commands means upgrading 
won't be possible anymore.

3. I'd like to use the kernel driver to test my TPM implementation. So for 
example, I send an invalid command code to the TPM and expect 
TPM_RC_COMMAND_CODE in response, but now I get EINVAL instead and the TPM never 
sees the command.

From my point of view, the kernel driver should provide a transparent 
communication channel to the TPM. Whatever I write to /dev/tpm should arrive 
at the TPM device, so that the TPM can handle it and return the appropriate 
response. Otherwise, you'll end up reimplementing all the command handling 
logic, that is already part of the TPM's job, and as soon as you miss one case 
and behave differently than the TPM, something relying on this behavior will 
break.

I see two possible solutions:

1. When the driver does not know a command code, it passes through the command 
unmodified. This bears the risk of unknown side effects though, so TPM spaces 
might not be as independent as they should be.

2. Since the command code lookup is only really necessary for TPM spaces, it 
only gets activated when space != NULL. So the change will not affect 
/dev/tpm, but only the new /dev/tpmrm. As /dev/tpmrm is not meant to 
be a transparent interface anyway, rejecting unknown commands is acceptable.

Alexander


RE: [tpmdd-devel] [PATCH v3 4/7] tpm: infrastructure for TPM spaces

2017-03-17 Thread Alexander.Steffen
> diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-
> interface.c
> index 20b1fe3..db5ffe9 100644
> --- a/drivers/char/tpm/tpm-interface.c
> +++ b/drivers/char/tpm/tpm-interface.c
> @@ -376,11 +376,12 @@ static bool tpm_validate_command(struct
> tpm_chip *chip, const u8 *cmd,
>   * 0 when the operation is successful.
>   * A negative number for system errors (errno).
>   */
> -ssize_t tpm_transmit(struct tpm_chip *chip, const u8 *buf, size_t bufsiz,
> -  unsigned int flags)
> +ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
> +  u8 *buf, size_t bufsiz, unsigned int flags)

When adding parameters, please also update the parameter documentation at the 
top of the function. It is missing for the new parameter "space" here.

Alexander


RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-20 Thread Alexander.Steffen
>>> 2. When upgrading the firmware on my TPM, it switches to a
>>> non-standard communication mode for the upgrade process and does not
>>> communicate using TPM2.0 commands during this time. Rejecting
>>> non-TPM2.0 commands means upgrading won't be possible anymore.
>>
>> How non standard? Is the basic header even there? Are the lengths and
>> status code right?
>>
>> This might be an argument to add a 'raw' ioctl or something specifically
>> for this special case.
>
> It follows the regular TPM command syntax and looks something like 1.2
> commands.

Yep, so most of it already works with the current implementation.

There are a few special cases that need some thought though. For example, it is 
possible to use an upgrade to switch the TPM family from 1.2 to 2.0 (or vice 
versa). In this case it seems useful to let the kernel reinitialize the TPM 
driver, so it uses the correct timeouts for communication, activates the 
correct features (resource manager or not?), etc., without needing to reboot 
the system.

Another problem arises when the upgrade process is interrupted, e.g. because 
power is lost. Then the TPM is stuck in its non-standard upgrade mode, so the 
kernel does not recognize it as a valid TPM device and does not export 
/dev/tpm. But without the device node the upgrader is unable to restart the 
upgrade process, leaving the TPM forever inaccessible.

I'll try to work on those problems in the coming weeks and provide the fixes. 
Any input is appreciated.

Alexander


RE: [tpmdd-devel] [PATCH v3 2/7] tpm: validate TPM 2.0 commands

2017-03-20 Thread Alexander.Steffen
> I think the most straight-forward way to sort this out would be to limit
> validation to the resource manager.

Sounds good to me.

> If I send a fix, would you care to test it?

Sure, will do.

Alexander