from:"David Howells"

Re: [PATCH] X.509: Partially revert patch to add validation against IMA MOK keyring

2016-01-06 Thread David Howells

David Howells  wrote:

> Partially revert commit 41c89b64d7184a780f12f2cccdabe65cb2408893:
> 
>   Author: Petko Manolov 
>   Date:   Wed Dec 2 17:47:55 2015 +0200
>   IMA: create machine owner and blacklist keyrings
> 
> The problem is that prep->trusted is a simple boolean and the additional
> x509_validate_trust() call doesn't therefore distinguish levels of
> trustedness, but is just OR'd with the result of validation against the
> system trusted keyring.
> 
> However, setting the trusted flag means that this key may be added to *any*
> trusted-only keyring - including the system trusted keyring.
> 
> Whilst I appreciate what the patch is trying to do, I don't think this is
> quite the right solution.

Please apply this to security/next.

Thanks,
David
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [RFC PATCH] X.509: Don't check the signature on apparently self-signed keys [ver #2]

2016-01-06 Thread David Howells

Mimi Zohar  wrote:

> Once the builtin keys are loaded onto the system keyring, isn't the
> system keyring locked?

No.

David

[PATCH] X.509: Partially revert patch to add validation against IMA MOK keyring

2016-01-06 Thread David Howells

Partially revert commit 41c89b64d7184a780f12f2cccdabe65cb2408893:

Author: Petko Manolov 
Date:   Wed Dec 2 17:47:55 2015 +0200
IMA: create machine owner and blacklist keyrings

The problem is that prep->trusted is a simple boolean and the additional
x509_validate_trust() call doesn't therefore distinguish levels of
trustedness, but is just OR'd with the result of validation against the
system trusted keyring.

However, setting the trusted flag means that this key may be added to *any*
trusted-only keyring - including the system trusted keyring.

Whilst I appreciate what the patch is trying to do, I don't think this is
quite the right solution.

Signed-off-by: David Howells 
cc: Petko Manolov 
cc: Mimi Zohar 
cc: keyri...@vger.kernel.org
---

 crypto/asymmetric_keys/x509_public_key.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
index 9e9e5a6a9ed6..2a44b3752471 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -321,8 +321,6 @@ static int x509_key_preparse(struct key_preparsed_payload *prep)
 			goto error_free_cert;
 	} else if (!prep->trusted) {
 		ret = x509_validate_trust(cert, get_system_trusted_keyring());
-		if (ret)
-			ret = x509_validate_trust(cert, get_ima_mok_keyring());
 		if (!ret)
 			prep->trusted = 1;
 	}


Re: [RFC PATCH] X.509: Don't check the signature on apparently self-signed keys [ver #2]

2016-01-06 Thread David Howells

Mimi Zohar  wrote:

> The x509_validate_trust() was originally added for IMA to ensure, on a
> secure boot system, a certificate chain of trust rooted in hardware.
> The IMA MOK keyring extends this certificate chain of trust to the
> running system.

The problem is that because 'trusted' is a boolean, a key in the IMA MOK
keyring will permit addition to the system keyring.
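
The effect can be modelled in a few lines of userspace C.  This is only a
sketch under stated assumptions: struct prep, preparse() and may_link() are
hypothetical stand-ins, not the kernel's key_preparsed_payload or keyring
code.  It shows that once both validations feed a single flag, a
trusted-only keyring has no way to tell which keyring vouched for the key.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two keyrings discussed in the thread. */
enum keyring { SYSTEM_TRUSTED, IMA_MOK };

/* Models the preparsed payload: trust is a single yes/no flag. */
struct prep {
	bool trusted;
};

/*
 * Models the preparse step with the extra validation in place: a cert
 * vouched for by either keyring sets the same flag, so the source of the
 * trust is lost.
 */
static void preparse(struct prep *prep, bool signed_by_system,
		     bool signed_by_mok)
{
	prep->trusted = signed_by_system || signed_by_mok;
}

/* Models a trusted-only keyring: it can only consult the boolean. */
static bool may_link(const struct prep *prep, enum keyring dest)
{
	(void)dest;	/* the destination cannot influence the decision */
	return prep->trusted;
}
```

In this model a key validated only against IMA_MOK is accepted by
may_link() for SYSTEM_TRUSTED as well, which is the escalation described
above.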

David

Re: [RFC PATCH] X.509: Don't check the signature on apparently self-signed keys [ver #2]

2016-01-05 Thread David Howells

Mimi Zohar  wrote:

> You're missing Petko's patch:
> 41c89b6 IMA: create machine owner and blacklist keyrings

It should also be cc'd to the keyrings mailing list.

David

Re: [RFC PATCH] X.509: Don't check the signature on apparently self-signed keys [ver #2]

2016-01-05 Thread David Howells

Mimi Zohar  wrote:

> You're missing Petko's patch:
> 41c89b6 IMA: create machine owner and blacklist keyrings

Hmmm...  This is wrong.  x509_key_preparse() shouldn't be polling the IMA MOK
keyring under all circumstances.

David

[RFC PATCH] X.509: Don't treat self-signed keys specially

2016-01-05 Thread David Howells

Trust for a self-signed certificate can normally only be determined by
whether we obtained it from a trusted location (ie. it was built into the
kernel at compile time), so there's not really any point in checking it -
we could verify that the signature is valid, but it doesn't really tell us
anything if the signature checks out.

However, there's a bug in the code determining whether a certificate is
self-signed or not - if they have neither AKID nor SKID then we just assume
that the cert is self-signed, which may not be true.

Given this, remove the code that treats self-signed certs specially when it
comes to evaluating trustability and attempt to evaluate them as ordinary
signed certificates.  We then expect self-signed certificates to fail the
trustability check and be marked as untrustworthy in x509_key_preparse().

Note that there is the possibility of the trustability check on a
self-signed cert then succeeding.  This is most likely to happen when a
duplicate of the certificate is already on the trust keyring - in which
case it shouldn't be a problem.

Signed-off-by: David Howells 
cc: David Woodhouse 
cc: Mimi Zohar 
---

 crypto/asymmetric_keys/x509_public_key.c |   25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
index 2a44b3752471..26e1937af7f4 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -255,6 +255,9 @@ static int x509_validate_trust(struct x509_certificate *cert,
struct key *key;
int ret = 1;
 
+   if (!cert->akid_id || !cert->akid_skid)
+   return 1;
+
if (!trust_keyring)
return -EOPNOTSUPP;
 
@@ -312,17 +315,21 @@ static int x509_key_preparse(struct key_preparsed_payload *prep)
cert->pub->algo = pkey_algo[cert->pub->pkey_algo];
cert->pub->id_type = PKEY_ID_X509;
 
-   /* Check the signature on the key if it appears to be self-signed */
-   if ((!cert->akid_skid && !cert->akid_id) ||
-   asymmetric_key_id_same(cert->skid, cert->akid_skid) ||
-   asymmetric_key_id_same(cert->id, cert->akid_id)) {
-   ret = x509_check_signature(cert->pub, cert); /* self-signed */
-   if (ret < 0)
-   goto error_free_cert;
-   } else if (!prep->trusted) {
+   /* See if we can derive the trustability of this certificate.
+*
+* When it comes to self-signed certificates, we cannot evaluate
+* trustedness except by the fact that we obtained it from a trusted
+* location.  So we just rely on x509_validate_trust() failing in this
+* case.
+*
+* Note that there's a possibility of a self-signed cert matching a
+* cert that we have (most likely a duplicate that we already trust) -
+* in which case it will be marked trusted.
+*/
+   if (!prep->trusted) {
ret = x509_validate_trust(cert, get_system_trusted_keyring());
if (!ret)
-   prep->trusted = 1;
+   prep->trusted = true;
}
 
/* Propose a description */


Re: [RFC PATCH] X.509: Don't check the signature on apparently self-signed keys [ver #2]

2016-01-05 Thread David Howells

David Howells  wrote:

> If a certificate is self-signed, don't bother checking the validity of the
> signature.  The cert cannot be checked by validation against the next one
> in the chain as this is the root of the chain.  Trust for this certificate
> can only be determined by whether we obtained it from a trusted location
> (ie. it was built into the kernel at compile time).
> 
> This also fixes a bug whereby certificates were being assumed to be
> self-signed if they had neither AKID nor SKID, the symptoms of which show
> up as an attempt to load a certificate failing with -ERANGE or -EBADMSG.
> This is produced from the RSA module when the result of calculating "m =
> s^e mod n" is checked.

Oops - I forgot to change the patch description.

David

[RFC PATCH] X.509: Don't check the signature on apparently self-signed keys [ver #2]

2016-01-05 Thread David Howells

If a certificate is self-signed, don't bother checking the validity of the
signature.  The cert cannot be checked by validation against the next one
in the chain as this is the root of the chain.  Trust for this certificate
can only be determined by whether we obtained it from a trusted location
(ie. it was built into the kernel at compile time).

This also fixes a bug whereby certificates were being assumed to be
self-signed if they had neither AKID nor SKID, the symptoms of which show
up as an attempt to load a certificate failing with -ERANGE or -EBADMSG.
This is produced from the RSA module when the result of calculating "m =
s^e mod n" is checked.
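
As a toy illustration of that calculation, the sketch below verifies a
signature with deliberately tiny numbers (n = 33, e = 3, private exponent
7).  The helper names modpow() and toy_rsa_verify() are ours, and real RSA
verification additionally checks the size and padding of the recovered
value rather than comparing it to a bare integer.  The point is only that
applying the wrong public key to a signature yields an unrelated m, which
is what happens when a cert wrongly assumed to be self-signed is checked
against its own key.

```c
#include <stdint.h>

/*
 * Square-and-multiply modular exponentiation: returns base^exp mod n.
 * Only safe for small n (below 2^32), since products are held in 64 bits.
 */
static uint64_t modpow(uint64_t base, uint64_t exp, uint64_t n)
{
	uint64_t result = 1 % n;

	base %= n;
	while (exp) {
		if (exp & 1)
			result = result * base % n;
		base = base * base % n;
		exp >>= 1;
	}
	return result;
}

/* Returns 1 if signature s recovers the value m under public key (e, n). */
static int toy_rsa_verify(uint64_t s, uint64_t e, uint64_t n, uint64_t m)
{
	return modpow(s, e, n) == m;
}
```

With m = 4 the signature is 4^7 mod 33 = 16, and 16^3 mod 33 recovers 4.
Checking the same signature against an unrelated modulus such as 35 gives
16^3 mod 35 = 1, so the verification fails.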

Signed-off-by: David Howells 
cc: David Woodhouse 
cc: Mimi Zohar 
---

 crypto/asymmetric_keys/x509_public_key.c |   25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
index 2a44b3752471..26e1937af7f4 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -255,6 +255,9 @@ static int x509_validate_trust(struct x509_certificate *cert,
struct key *key;
int ret = 1;
 
+   if (!cert->akid_id || !cert->akid_skid)
+   return 1;
+
if (!trust_keyring)
return -EOPNOTSUPP;
 
@@ -312,17 +315,21 @@ static int x509_key_preparse(struct key_preparsed_payload *prep)
cert->pub->algo = pkey_algo[cert->pub->pkey_algo];
cert->pub->id_type = PKEY_ID_X509;
 
-   /* Check the signature on the key if it appears to be self-signed */
-   if ((!cert->akid_skid && !cert->akid_id) ||
-   asymmetric_key_id_same(cert->skid, cert->akid_skid) ||
-   asymmetric_key_id_same(cert->id, cert->akid_id)) {
-   ret = x509_check_signature(cert->pub, cert); /* self-signed */
-   if (ret < 0)
-   goto error_free_cert;
-   } else if (!prep->trusted) {
+   /* See if we can derive the trustability of this certificate.
+*
+* When it comes to self-signed certificates, we cannot evaluate
+* trustedness except by the fact that we obtained it from a trusted
+* location.  So we just rely on x509_validate_trust() failing in this
+* case.
+*
+* Note that there's a possibility of a self-signed cert matching a
+* cert that we have (most likely a duplicate that we already trust) -
+* in which case it will be marked trusted.
+*/
+   if (!prep->trusted) {
ret = x509_validate_trust(cert, get_system_trusted_keyring());
if (!ret)
-   prep->trusted = 1;
+   prep->trusted = true;
}
 
/* Propose a description */


[RFC PATCH] X.509: Don't check the signature on apparently self-signed keys

2016-01-05 Thread David Howells

If a certificate is self-signed, don't bother checking the validity of the
signature.  The cert cannot be checked by validation against the next one
in the chain as this is the root of the chain.  Trust for this certificate
can only be determined by whether we obtained it from a trusted location
(ie. it was built into the kernel at compile time).

This also fixes a bug whereby certificates were being assumed to be
self-signed if they had neither AKID nor SKID, the symptoms of which show
up as an attempt to load a certificate failing with -ERANGE or -EBADMSG.
This is produced from the RSA module when the result of calculating "m =
s^e mod n" is checked.

Signed-off-by: David Howells 
cc: David Woodhouse 
cc: Mimi Zohar 
---

 crypto/asymmetric_keys/x509_public_key.c |   15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_public_key.c b/crypto/asymmetric_keys/x509_public_key.c
index 2a44b3752471..663624225882 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -255,6 +255,9 @@ static int x509_validate_trust(struct x509_certificate *cert,
struct key *key;
int ret = 1;
 
+   if (!cert->akid_id || !cert->akid_skid)
+   return 1;
+   
if (!trust_keyring)
return -EOPNOTSUPP;
 
@@ -312,13 +315,13 @@ static int x509_key_preparse(struct key_preparsed_payload *prep)
cert->pub->algo = pkey_algo[cert->pub->pkey_algo];
cert->pub->id_type = PKEY_ID_X509;
 
-   /* Check the signature on the key if it appears to be self-signed */
-   if ((!cert->akid_skid && !cert->akid_id) ||
-   asymmetric_key_id_same(cert->skid, cert->akid_skid) ||
+   /* See if we can derive the trustability of this certificate */
+   if (asymmetric_key_id_same(cert->skid, cert->akid_skid) ||
asymmetric_key_id_same(cert->id, cert->akid_id)) {
-   ret = x509_check_signature(cert->pub, cert); /* self-signed */
-   if (ret < 0)
-   goto error_free_cert;
+   /* Self-signed.  We cannot evaluate the trustedness of this
+* cert, except by the fact that we obtained it from a trusted
+* location.
+*/
} else if (!prep->trusted) {
ret = x509_validate_trust(cert, get_system_trusted_keyring());
if (!ret)


[RFC PATCH 3/4] X.509: Support leap seconds

2016-01-04 Thread David Howells

The format of ASN.1 GeneralizedTime seems to be specified by ISO 8601
[X.680 46.3] and this apparently supports leap seconds (ie. the seconds
field is 60).  It's not entirely clear that ASN.1 expects it, but we can
relax the seconds check slightly for GeneralizedTime.

This results in us passing a time with sec as 60 to mktime64(), which
handles it as being a duplicate of the 0th second of the next minute.

We can't really do otherwise without giving the kernel much greater
knowledge of where all the leap seconds are.  Unfortunately, this would
require changing the mapping of the kernel's current-time-in-seconds.

UTCTime, however, only supports a seconds value in the range 00-59, but for
the sake of simplicity allow this with UTCTime also.

Without this patch, certain X.509 certificates will be rejected,
potentially making a kernel unbootable.

Reported-by: Rudolf Polzer 
Signed-off-by: David Howells 
cc: Arnd Bergmann 
cc: David Woodhouse 
cc: John Stultz 
---

 crypto/asymmetric_keys/x509_cert_parser.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
index 13c4e5a5fe8c..3379c0ba3988 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -550,7 +550,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (day < 1 || day > mon_len ||
hour > 23 ||
min > 59 ||
-   sec > 59)
+   sec > 60) /* ISO 8601 permits leap seconds [X.680 46.3] */
goto invalid_time;
 
*_t = mktime64(year, mon, day, hour, min, sec);


[RFC PATCH 2/4] Handle ISO 8601 leap seconds and encodings of midnight in mktime64()

2016-01-04 Thread David Howells

Handle the following ISO 8601 features in mktime64():

 (1) Leap seconds.

 Leap seconds are indicated by the seconds parameter being the value
 60.  Handle this by treating it the same as 00 of the following
 minute.

 (2) Alternate encodings of midnight.

 Two different encodings of midnight are permitted - 00:00:00 and
 24:00:00 - the first is midnight today and the second is midnight
 tomorrow and is exactly equivalent to the first with tomorrow's date.

As it happens, we don't actually need to change mktime64() to handle either
of these - just comment them as valid parameters.
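
Both equivalences can be checked in userspace.  The sketch below copies the
arithmetic of mktime64() into a function we have named my_mktime64(); the
name and the use of int64_t in place of time64_t are our assumptions, and
the body is meant to mirror the kernel arithmetic rather than replace it.

```c
#include <stdint.h>

/*
 * Userspace copy of the mktime64() arithmetic: converts a Gregorian date
 * and time to seconds since 1970-01-01 00:00:00.  No range checking is
 * done, so sec == 60 and hour == 24 simply carry into the next unit.
 */
static int64_t my_mktime64(unsigned int year0, unsigned int mon0,
			   unsigned int day, unsigned int hour,
			   unsigned int min, unsigned int sec)
{
	unsigned int mon = mon0, year = year0;

	/* 1..12 -> 11,12,1..10: puts Feb last since it has the leap day */
	if (0 >= (int)(mon -= 2)) {
		mon += 12;
		year -= 1;
	}

	return ((((int64_t)
		  (year/4 - year/100 + year/400 + 367*mon/12 + day) +
		  year*365 - 719499
		 )*24 + hour	/* now have hours */
		)*60 + min	/* now have minutes */
	       )*60 + sec;	/* finally seconds */
}
```

Under this arithmetic my_mktime64(2015, 6, 30, 23, 59, 60) equals
my_mktime64(2015, 7, 1, 0, 0, 0), and my_mktime64(2015, 12, 31, 24, 0, 0)
equals my_mktime64(2016, 1, 1, 0, 0, 0), because the fields are combined
by plain multiplication and addition.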

These facilities will be used by the X.509 parser.  Doing it in mktime64()
makes the policy common to the whole kernel and easier to find.

Signed-off-by: David Howells 
cc: Arnd Bergmann 
cc: John Stultz 
---

 kernel/time/time.c |9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/time/time.c b/kernel/time/time.c
index 86751c68e08d..be115b020d27 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -322,6 +322,13 @@ EXPORT_SYMBOL(timespec_trunc);
  * -year/100+year/400 terms, and add 10.]
  *
  * This algorithm was first published by Gauss (I think).
+ *
+ * A leap second can be indicated by calling this function with sec as
+ * 60 (allowable under ISO 8601).  The leap second is treated the same
+ * as the following second since they don't exist in UNIX time.
+ *
+ * An encoding of midnight at the end of the day as 24:00:00 - ie. midnight
+ * tomorrow - (allowable under ISO 8601) is supported.
  */
 time64_t mktime64(const unsigned int year0, const unsigned int mon0,
const unsigned int day, const unsigned int hour,
@@ -338,7 +345,7 @@ time64_t mktime64(const unsigned int year0, const unsigned int mon0,
 	return ((((time64_t)
 		  (year/4 - year/100 + year/400 + 367*mon/12 + day) +
 		  year*365 - 719499
-	    )*24 + hour /* now have hours */
+	    )*24 + hour /* now have hours - midnight tomorrow handled here */
 	  )*60 + min /* now have minutes */
 	)*60 + sec; /* finally seconds */
 }


[RFC PATCH 4/4] X.509: Handle midnight alternative notation in GeneralizedTime

2016-01-04 Thread David Howells

The ASN.1 GeneralizedTime object carries an ISO 8601 format date and time.
The time is permitted to show midnight as 00:00 or 24:00 (the latter being
equivalent to 00:00 of the following day).

The permitted value is checked in x509_decode_time() but the actual
handling is left to mktime64().

Without this patch, certain X.509 certificates will be rejected and could
lead to an unbootable kernel.

Note that with this patch we also permit any 24:mm:ss time and extend this
to UTCTime, which, whilst not strictly correct, doesn't permit much leeway in
fiddling with date strings.
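
To make that leeway concrete, here is a sketch contrasting the relaxed
range check with a stricter variant.  Both function names are hypothetical,
and the strict variant is not part of this series; it is shown only to
illustrate which extra values the relaxed check lets through.

```c
#include <stdbool.h>

/* Relaxed check, mirroring the bounds used after this patch. */
static bool time_ok_relaxed(unsigned int hour, unsigned int min,
			    unsigned int sec)
{
	return !(hour > 24 ||
		 min > 59 ||
		 sec > 60);
}

/*
 * Hypothetical stricter check: hour 24 is accepted only as exactly
 * 24:00:00; otherwise the usual bounds apply (leap second still allowed).
 */
static bool time_ok_strict(unsigned int hour, unsigned int min,
			   unsigned int sec)
{
	if (hour == 24)
		return min == 0 && sec == 0;
	return hour <= 23 && min <= 59 && sec <= 60;
}
```

A time such as 24:30:60 passes time_ok_relaxed() but not time_ok_strict(),
while 24:00:00 passes both and 25:00:00 passes neither.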

Reported-by: Rudolf Polzer 
Signed-off-by: David Howells 
cc: David Woodhouse 
cc: John Stultz 
cc: Arnd Bergmann 
---

 crypto/asymmetric_keys/x509_cert_parser.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
index 3379c0ba3988..70ed0852fdb2 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -548,7 +548,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
}
 
if (day < 1 || day > mon_len ||
-   hour > 23 ||
+   hour > 24 || /* ISO 8601 permits 24:00:00 as midnight tomorrow */
min > 59 ||
sec > 60) /* ISO 8601 permits leap seconds [X.680 46.3] */
goto invalid_time;


[RFC PATCH 1/4] X.509: Fix leap year handling again

2016-01-04 Thread David Howells

There are still a couple of minor issues in the X.509 leap year handling:

 (1) To avoid doing a modulus-by-400 in addition to a modulus-by-100 when
 determining whether the year is a leap year or not, I divided the year
 by 100 after doing the modulus-by-100, thereby letting the compiler do
 one instruction for both, and then did a modulus-by-4.

 Unfortunately, I then passed the now-modified year value to mktime64()
 to construct a time value.

 Since this isn't a fast path and since mktime64() does a bunch of
 divisions, just condense down to "% 400".  It's also easier to read.

 (2) The default month length for any February where the year doesn't
 divide by four exactly is obtained from the month_length[] array where
 the value is 29, not 28.

 This is fixed by altering the table.
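
The corrected logic can be exercised on its own.  The following is a
sketch only: month_length() is a hypothetical helper that wraps the table
and the leap-year test from the patch, and it leaves the year value
unmodified, so the caller can still pass it on to mktime64().

```c
/* Days in month `mon` (1-12) of `year`, following the fixed logic. */
static unsigned int month_length(unsigned int year, unsigned int mon)
{
	static const unsigned char month_lengths[] = { 31, 28, 31, 30, 31, 30,
						       31, 31, 30, 31, 30, 31 };
	unsigned int mon_len = month_lengths[mon - 1];

	if (mon == 2 && year % 4 == 0) {
		mon_len = 29;
		if (year % 100 == 0) {
			/* Century years are leap years only if divisible by 400 */
			mon_len = 28;
			if (year % 400 == 0)
				mon_len = 29;
		}
	}
	return mon_len;
}
```

February then has 29 days in 2016 and 2000, and 28 days in 2015 and 1900.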

Reported-by: Rudolf Polzer 
Signed-off-by: David Howells 
Acked-By: David Woodhouse 
cc: sta...@vger.kernel.org
---

 crypto/asymmetric_keys/x509_cert_parser.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
index 021d39c0ba75..13c4e5a5fe8c 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -494,7 +494,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
 unsigned char tag,
 const unsigned char *value, size_t vlen)
 {
-   static const unsigned char month_lengths[] = { 31, 29, 31, 30, 31, 30,
+   static const unsigned char month_lengths[] = { 31, 28, 31, 30, 31, 30,
   31, 31, 30, 31, 30, 31 };
const unsigned char *p = value;
unsigned year, mon, day, hour, min, sec, mon_len;
@@ -540,9 +540,9 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (year % 4 == 0) {
mon_len = 29;
if (year % 100 == 0) {
-   year /= 100;
-   if (year % 4 != 0)
-   mon_len = 28;
+   mon_len = 28;
+   if (year % 400 == 0)
+   mon_len = 29;
}
}
}


[PATCH 0/4] X.509: Fix time handling

2016-01-04 Thread David Howells


Here's a set of patches that fix X.509 time handling in three ways:

 (1) Fix leap year handling.

 (2) Add leap second handling (where you get a time of 23:59:60).

 (3) Add end-of-day midnight encoding (where you get a time of 24:00:00).

David
---
David Howells (4):
  X.509: Fix leap year handling again
  Handle ISO 8601 leap seconds and encodings of midnight in mktime64()
  X.509: Support leap seconds
  X.509: Handle midnight alternative notation in GeneralizedTime


 crypto/asymmetric_keys/x509_cert_parser.c |   12 ++--
 kernel/time/time.c|9 -
 2 files changed, 14 insertions(+), 7 deletions(-)


Re: [GIT PULL] Keys fixes

2015-12-18 Thread David Howells

Linus Torvalds  wrote:

> Side note: the key handling extra checks seem pretty pointless too.

Except that it has been argued that they have to be there or someone can use
dates that contribute to the signature to fake a signed content.  Admittedly
being able to have a seconds=60 value in somewhere that should stop at 59
doesn't allow a lot of contribution...

> There's no reason to have those "some time formats allow 60 seconds,
> some don't".

Feel free to explain that to the people who drafted the ASN.1 standards.
Maybe they'll listen to you...

> And you know what? If somebody decides that they want to have a key
> that says it was done at some nonsensical time like 24:30:60, just let
> it go. Just accept it. It's not your problem.

I've been told that it's a security hole.

David

Re: [GIT PULL] Keys fixes

2015-12-18 Thread David Howells

Linus Torvalds  wrote:

> > David Howells (7):
> >   Handle leap seconds in mktime64()
> 
> This one is completely wrong.
> 
> Leap seconds are inserted *at* the minute, not at the second before the
> minute.
> 
> So this code:
> 
> +   /* Handle leap seconds */
> +   if (sec == 60)
> +   sec = 59;
> 
> is just complete crap. Making the whole commit bogus and wrong.

I did ask on ksummit-discuss beforehand.  The advice was to treat hh:mm:60 as
hh:mm:59 rather than hh:mm+1:00.  Unless we actually support leap seconds as
distinct time_t values, it has to be one or the other.

> The code did the right thing wrt leap seconds before, without having
> any magical and incorrect special case. That commit makes it instead
> have two seconds of xx:xx:59.

... as opposed to two seconds of xx:xx+1:00.  You can argue it either way -
and arguably both are equally wrong since neither maps correctly to reality.

> The fact that people add extra code to make things extra wrong is
> annoying. The patch is marked as being cc'd to John Stultz, but I
> assume it was never acked, because I doubt he would ack something like
> this.
>
> To make things worse, this whole series seems to have existed for less
> than one day, and then it was sent to me as a pull request, however
> buggy and non-acked it was.

I only asked James to pass the CVE-labelled commit on to you and didn't
include it in a patch series.  The rest I posted hoping for reviews.

> To make things EVEN *more* broken, this crap was marked for stable.

It will theoretically need to end up there anyway, since it is technically
possible for the bugs to prevent a kernel from booting - just not very likely.

Re: [PATCH] X.509: Fix determination of self-signedness

2015-12-18 Thread David Howells

Josh Boyer  wrote:

> Should this also be Cc'd to stable?

Argh.  Probably.

David

[PATCH] KEYS: Fix race between read and revoke

2015-12-17 Thread David Howells

This fixes CVE-2015-7550.

There's a race between keyctl_read() and keyctl_revoke().  If the revoke
happens between keyctl_read() checking the validity of a key and the key's
semaphore being taken, then the key type read method will see a revoked key.

This causes a problem for the user-defined key type because it assumes in
its read method that there will always be a payload in a non-revoked key
and doesn't check for a NULL pointer.

Fix this by making keyctl_read() check the validity of a key after taking
the semaphore instead of before.

I think the bug was introduced with the original keyrings code.

This was discovered by a multithreaded test program generated by syzkaller
(http://github.com/google/syzkaller).  Here's a cleaned up version:

#include <sys/types.h>
#include <keyutils.h>
#include <pthread.h>
void *thr0(void *arg)
{
key_serial_t key = (unsigned long)arg;
keyctl_revoke(key);
return 0;
}
void *thr1(void *arg)
{
key_serial_t key = (unsigned long)arg;
char buffer[16];
keyctl_read(key, buffer, 16);
return 0;
}
int main()
{
key_serial_t key = add_key("user", "%", "foo", 3, KEY_SPEC_USER_KEYRING);
pthread_t th[5];
pthread_create(&th[0], 0, thr0, (void *)(unsigned long)key);
pthread_create(&th[1], 0, thr1, (void *)(unsigned long)key);
pthread_create(&th[2], 0, thr0, (void *)(unsigned long)key);
pthread_create(&th[3], 0, thr1, (void *)(unsigned long)key);
pthread_join(th[0], 0);
pthread_join(th[1], 0);
pthread_join(th[2], 0);
pthread_join(th[3], 0);
return 0;
}

Build as:

cc -o keyctl-race keyctl-race.c -lkeyutils -lpthread

Run as:

while keyctl-race; do :; done

as it may need several iterations to crash the kernel.  The crash can be
summarised as:

BUG: unable to handle kernel NULL pointer dereference at 0010
IP: [] user_read+0x56/0xa3
...
Call Trace:
 [] keyctl_read_key+0xb6/0xd7
 [] SyS_keyctl+0x83/0xe0
 [] entry_SYSCALL_64_fastpath+0x12/0x6f

Reported-by: Dmitry Vyukov 
Signed-off-by: David Howells 
Tested-by: Dmitry Vyukov 
Cc: sta...@vger.kernel.org
---

 security/keys/keyctl.c |   18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index fb111eafcb89..1c3872aeed14 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -751,16 +751,16 @@ long keyctl_read_key(key_serial_t keyid, char __user *buffer, size_t buflen)
 
/* the key is probably readable - now try to read it */
 can_read_key:
-   ret = key_validate(key);
-   if (ret == 0) {
-   ret = -EOPNOTSUPP;
-   if (key->type->read) {
-   /* read the data with the semaphore held (since we
-* might sleep) */
-   down_read(&key->sem);
+   ret = -EOPNOTSUPP;
+   if (key->type->read) {
+   /* Read the data with the semaphore held (since we might sleep)
+* to protect against the key being updated or revoked.
+*/
+   down_read(&key->sem);
+   ret = key_validate(key);
+   if (ret == 0)
ret = key->type->read(key, buffer, buflen);
-   up_read(&key->sem);
-   }
+   up_read(&key->sem);
}
 
 error2:


[PATCH 2/5] Handle leap seconds in mktime64()

2015-12-17 Thread David Howells

Handle leap seconds in mktime64() - where the seconds parameter is the
value 60 - by treating it the same as 59.

This facility will be used by the X.509 parser.  Doing it in mktime64()
makes the policy common to the whole kernel and easier to find.

Whilst we're at it, remove the const markers from all the parameters since
they don't really achieve anything and we do need to alter the sec
parameter.
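
The two possible policies for sec == 60 differ by exactly one second.  The
sketch below computes seconds since the start of the day under each; the
function names are hypothetical and only the time-of-day part of the
calculation is modelled.

```c
/* Seconds since the start of the day, with sec == 60 clamped to 59. */
static unsigned int secs_clamped(unsigned int hour, unsigned int min,
				 unsigned int sec)
{
	if (sec == 60)
		sec = 59;
	return (hour * 60 + min) * 60 + sec;
}

/* Same, but sec == 60 is left alone and so rolls into the next minute. */
static unsigned int secs_rollover(unsigned int hour, unsigned int min,
				  unsigned int sec)
{
	return (hour * 60 + min) * 60 + sec;
}
```

For 23:59:60 the clamped form gives 86399, the same as 23:59:59, whereas
the rollover form gives 86400, the same as 00:00:00 of the next day.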

Signed-off-by: David Howells 
cc: John Stultz 
cc: Arnd Bergmann 
cc: sta...@vger.kernel.org
---

 include/linux/time.h |   13 ++---
 kernel/time/time.c   |   14 +++---
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index beebe3a02d43..35384f0c0aa2 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -39,17 +39,16 @@ static inline int timeval_compare(const struct timeval *lhs, const struct timeva
return lhs->tv_usec - rhs->tv_usec;
 }
 
-extern time64_t mktime64(const unsigned int year, const unsigned int mon,
-   const unsigned int day, const unsigned int hour,
-   const unsigned int min, const unsigned int sec);
+extern time64_t mktime64(unsigned int year, unsigned int mon,
+unsigned int day, unsigned int hour,
+unsigned int min, unsigned int sec);
 
 /**
  * Deprecated. Use mktime64().
  */
-static inline unsigned long mktime(const unsigned int year,
-   const unsigned int mon, const unsigned int day,
-   const unsigned int hour, const unsigned int min,
-   const unsigned int sec)
+static inline unsigned long mktime(unsigned int year, unsigned int mon,
+  unsigned int day, unsigned int hour,
+  unsigned int min, unsigned int sec)
 {
return mktime64(year, mon, day, hour, min, sec);
 }
diff --git a/kernel/time/time.c b/kernel/time/time.c
index 86751c68e08d..1858b10602f5 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -322,10 +322,14 @@ EXPORT_SYMBOL(timespec_trunc);
  * -year/100+year/400 terms, and add 10.]
  *
  * This algorithm was first published by Gauss (I think).
+ *
+ * A leap second can be indicated by calling this function with sec as
+ * 60 (allowable under ISO 8601).  The leap second is treated the same
+ * as the preceding second since they don't exist in UNIX time.
  */
-time64_t mktime64(const unsigned int year0, const unsigned int mon0,
-   const unsigned int day, const unsigned int hour,
-   const unsigned int min, const unsigned int sec)
+time64_t mktime64(unsigned int year0, unsigned int mon0,
+ unsigned int day, unsigned int hour,
+ unsigned int min, unsigned int sec)
 {
unsigned int mon = mon0, year = year0;
 
@@ -335,6 +339,10 @@ time64_t mktime64(const unsigned int year0, const unsigned 
int mon0,
year -= 1;
}
 
+   /* Handle leap seconds */
+   if (sec == 60)
+   sec = 59;
+
return ((((time64_t)
  (year/4 - year/100 + year/400 + 367*mon/12 + day) +
  year*365 - 719499
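
For anyone wanting to poke at the behaviour outside the kernel, here is a minimal userspace sketch of the conversion with the clamp applied.  The name mktime64_sketch() and the int64_t typedef are stand-ins for illustration only, not the kernel symbols, and the body is reconstructed from the hunks above rather than copied from the tree.

```c
#include <stdint.h>

typedef int64_t time64_t;	/* userspace stand-in for the kernel type */

/*
 * Userspace sketch of the Gregorian-date-to-seconds conversion, with
 * the leap second clamp from this patch.  Hypothetical name.
 */
static time64_t mktime64_sketch(unsigned int year0, unsigned int mon0,
				unsigned int day, unsigned int hour,
				unsigned int min, unsigned int sec)
{
	unsigned int mon = mon0, year = year0;

	/* 1..12 -> 11,12,1..10: February goes last as it holds the leap day */
	if (0 >= (int)(mon -= 2)) {
		mon += 12;
		year -= 1;
	}

	/* Handle leap seconds: fold second 60 onto second 59 */
	if (sec == 60)
		sec = 59;

	return ((((time64_t)
		  (year/4 - year/100 + year/400 + 367*mon/12 + day) +
		  year*365 - 719499
		 )*24 + hour	/* now have hours */
		)*60 + min	/* now have minutes */
	       )*60 + sec;	/* finally seconds */
}
```

With this, 23:59:60 and 23:59:59 on the same day give the same value, which is the policy the description above settles on.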


[PATCH] X.509: Fix determination of self-signedness

2015-12-17 Thread David Howells

Fix determination of whether an X.509 certificate is self-signed or not.

It is currently assumed that a cert is self-signed if it has no
authorityKeyIdentifier or the authorityKeyIdentifier matches the
subjectKeyIdentifier.  However, it is possible to encounter a certificate
that has neither AKID nor SKID but is not self-signed.

The symptoms of this show up as an attempt to load a certificate failing
with -ERANGE or -EBADMSG, produced from the RSA module when the result of
calculating "m = s^e mod n" is checked.

To fix this, don't check to see if a certificate is self-signed if the
Issuer and Subject names differ.

Signed-off-by: David Howells 
cc: David Woodhouse 
---

 crypto/asymmetric_keys/x509_public_key.c |   11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_public_key.c 
b/crypto/asymmetric_keys/x509_public_key.c
index 2a44b3752471..6236e7996f19 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -313,9 +313,14 @@ static int x509_key_preparse(struct key_preparsed_payload 
*prep)
cert->pub->id_type = PKEY_ID_X509;
 
/* Check the signature on the key if it appears to be self-signed */
-   if ((!cert->akid_skid && !cert->akid_id) ||
-   asymmetric_key_id_same(cert->skid, cert->akid_skid) ||
-   asymmetric_key_id_same(cert->id, cert->akid_id)) {
+   if ((!cert->akid_skid && !cert->akid_id)) {
+   if (cert->raw_issuer_size == cert->raw_subject_size &&
+   memcmp(cert->raw_issuer, cert->raw_subject,
+  cert->raw_subject_size) == 0)
+   goto self_signed;
+   } else if (asymmetric_key_id_same(cert->skid, cert->akid_skid) ||
+  asymmetric_key_id_same(cert->id, cert->akid_id)) {
+self_signed:
ret = x509_check_signature(cert->pub, cert); /* self-signed */
if (ret < 0)
goto error_free_cert;
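
The decision made by the new condition can be modelled in isolation.  This is only a sketch: struct cert_names and its two boolean fields are hypothetical stand-ins for the real x509_certificate fields and the asymmetric_key_id_same() comparisons, so it shows the shape of the logic rather than the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for the parsed certificate */
struct cert_names {
	const unsigned char *raw_issuer;
	size_t raw_issuer_size;
	const unsigned char *raw_subject;
	size_t raw_subject_size;
	bool has_akid;			/* an authorityKeyIdentifier is present */
	bool akid_matches_own_id;	/* AKID matches the cert's own ID */
};

/* Should the signature be checked as if the cert were self-signed? */
static bool appears_self_signed(const struct cert_names *c)
{
	if (!c->has_akid)
		/* No AKID: only self-signed if Issuer == Subject, bytewise */
		return c->raw_issuer_size == c->raw_subject_size &&
		       memcmp(c->raw_issuer, c->raw_subject,
			      c->raw_subject_size) == 0;

	/* AKID present: self-signed only if it points back at this cert */
	return c->akid_matches_own_id;
}
```

A cert with no AKID and differing Issuer and Subject names now skips the self-signature check instead of failing it.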


[PATCH 1/5] X.509: Fix leap year handling again

2015-12-17 Thread David Howells

There are still a couple of minor issues in the X.509 leap year handling:

 (1) To avoid doing a modulus-by-400 in addition to a modulus-by-100 when
 determining whether the year is a leap year or not, I divided the year
 by 100 after doing the modulus-by-100, thereby letting the compiler do
 one instruction for both, and then did a modulus-by-4.

 Unfortunately, I then passed the now-modified year value to mktime64()
 to construct a time value.

 Since this isn't a fast path and since mktime64() does a bunch of
 divisions, just condense down to "% 400".  It's also easier to read.

 (2) The default month length for any February where the year doesn't
 divide by four exactly is obtained from the month_lengths[] array where
 the value is 29, not 28.

 This is fixed by altering the table.

Reported-by: Rudolf Polzer 
Signed-off-by: David Howells 
Acked-By: David Woodhouse 
cc: sta...@vger.kernel.org
---

 crypto/asymmetric_keys/x509_cert_parser.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index 021d39c0ba75..13c4e5a5fe8c 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -494,7 +494,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
 unsigned char tag,
 const unsigned char *value, size_t vlen)
 {
-   static const unsigned char month_lengths[] = { 31, 29, 31, 30, 31, 30,
+   static const unsigned char month_lengths[] = { 31, 28, 31, 30, 31, 30,
   31, 31, 30, 31, 30, 31 };
const unsigned char *p = value;
unsigned year, mon, day, hour, min, sec, mon_len;
@@ -540,9 +540,9 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (year % 4 == 0) {
mon_len = 29;
if (year % 100 == 0) {
-   year /= 100;
-   if (year % 4 != 0)
-   mon_len = 28;
+   mon_len = 28;
+   if (year % 400 == 0)
+   mon_len = 29;
}
}
}
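
Pulled out of the parser, the corrected month length logic looks roughly like the following.  The function name is made up for illustration and mon is assumed to be already range-checked to 1..12, as the parser does before this point.

```c
/* Days in month 'mon' (1..12) of 'year' under Gregorian rules. */
static unsigned int month_length_sketch(unsigned int year, unsigned int mon)
{
	static const unsigned char month_lengths[] = { 31, 28, 31, 30, 31, 30,
						       31, 31, 30, 31, 30, 31 };
	unsigned int mon_len = month_lengths[mon - 1];

	if (mon == 2) {
		if (year % 4 == 0) {
			mon_len = 29;
			if (year % 100 == 0) {
				/* Century years are leap only if divisible by 400 */
				mon_len = 28;
				if (year % 400 == 0)
					mon_len = 29;
			}
		}
	}
	return mon_len;
}
```

Note that year is no longer modified, so the value later handed to mktime64() is the one that was parsed.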


[PATCH 3/5] X.509: Support leap seconds

2015-12-17 Thread David Howells

The format of ASN.1 GeneralizedTime seems to be specified by ISO 8601
[X.680 46.3] and this apparently supports leap seconds (ie. the seconds
field is 60).  It's not entirely clear that ASN.1 expects it, but we can
relax the seconds check slightly for GeneralizedTime.

This, however, results in us passing a time with sec as 60 to mktime64()
which, unpatched, doesn't really handle such things.  What it will do is
equate the 60th second of a minute to the 0th second of the next minute.

We can't really do otherwise without giving the kernel much greater
knowledge of where all the leap seconds are.  Unfortunately, this would
require changing the mapping of the kernel's current-time-in-seconds.

UTCTime, however, only supports a seconds value in the range 00-59.

Without this patch, certain X.509 certificates will be rejected,
potentially making a kernel unbootable.

Reported-by: Rudolf Polzer 
Signed-off-by: David Howells 
cc: David Woodhouse 
cc: John Stultz 
cc: Arnd Bergmann 
cc: sta...@vger.kernel.org
---

 crypto/asymmetric_keys/x509_cert_parser.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index 13c4e5a5fe8c..9be2caebc57b 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -497,7 +497,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
static const unsigned char month_lengths[] = { 31, 28, 31, 30, 31, 30,
   31, 31, 30, 31, 30, 31 };
const unsigned char *p = value;
-   unsigned year, mon, day, hour, min, sec, mon_len;
+   unsigned year, mon, day, hour, min, sec, mon_len, max_sec;
 
 #define dec2bin(X) ({ unsigned char x = (X) - '0'; if (x > 9) goto 
invalid_time; x; })
 #define DD2bin(P) ({ unsigned x = dec2bin(P[0]) * 10 + dec2bin(P[1]); P += 2; 
x; })
@@ -511,6 +511,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
year += 1900;
else
year += 2000;
+   max_sec = 59;
} else if (tag == ASN1_GENTIM) {
/* GenTime: YYYYMMDDHHMMSSZ */
if (vlen != 15)
@@ -518,6 +519,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
year = DD2bin(p) * 100 + DD2bin(p);
if (year >= 1950 && year <= 2049)
goto invalid_time;
+   max_sec = 60; /* ISO 8601 permits leap seconds [X.680 46.3] */
} else {
goto unsupported_time;
}
@@ -550,7 +552,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (day < 1 || day > mon_len ||
hour > 23 ||
min > 59 ||
-   sec > 59)
+   sec > max_sec)
goto invalid_time;
 
*_t = mktime64(year, mon, day, hour, min, sec);
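
To see why a seconds value of 60 collides with the next minute when nothing clamps it, consider just the time-of-day part of the accumulation.  The two helpers below are illustrative only (hypothetical names) and leave the date part out entirely.

```c
/* Seconds since midnight, accumulated linearly with no special cases. */
static unsigned long tod_unclamped(unsigned int hour, unsigned int min,
				   unsigned int sec)
{
	return ((unsigned long)hour * 60 + min) * 60 + sec;
}

/* Same accumulation, but with second 60 folded onto second 59. */
static unsigned long tod_clamped(unsigned int hour, unsigned int min,
				 unsigned int sec)
{
	if (sec == 60)
		sec = 59;
	return tod_unclamped(hour, min, sec);
}
```

Unclamped, 12:34:60 and 12:35:00 map to the same number; clamped, 12:34:60 maps to the same number as 12:34:59 instead.  Either way two distinct inputs share an output, it is only a question of which neighbour the leap second is merged with.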


[PATCH 4/5] Handle both ISO 8601 encodings of midnight in mktime64()

2015-12-17 Thread David Howells

ISO 8601 format dates permit two different encodings of midnight - 00:00:00
and 24:00:00 - the first is midnight today and the second is midnight
tomorrow and is exactly equivalent to the first with tomorrow's date.

Note that the implementation of mktime64() doesn't actually need to be
changed to handle this - the multiplication by 3600 of the hour will take
care of it automatically.  However, we should document that this handling
is done in mktime64() and is thus in a common place in the kernel.

This handling is required for X.509 certificate parsing which can be given
ISO 8601 dates.

Signed-off-by: David Howells 
cc: John Stultz 
cc: Arnd Bergmann 
cc: sta...@vger.kernel.org
---

 kernel/time/time.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/time/time.c b/kernel/time/time.c
index 1858b10602f5..56e7ada38471 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -326,6 +326,9 @@ EXPORT_SYMBOL(timespec_trunc);
  * A leap second can be indicated by calling this function with sec as
  * 60 (allowable under ISO 8601).  The leap second is treated the same
  * as the preceding second since they don't exist in UNIX time.
+ *
+ * An encoding of midnight at the end of the day as 24:00:00 - ie. midnight
+ * tomorrow - (allowable under ISO 8601) is supported.
  */
 time64_t mktime64(unsigned int year0, unsigned int mon0,
  unsigned int day, unsigned int hour,
@@ -346,7 +349,7 @@ time64_t mktime64(unsigned int year0, unsigned int mon0,
return ((((time64_t)
  (year/4 - year/100 + year/400 + 367*mon/12 + day) +
  year*365 - 719499
-   )*24 + hour /* now have hours */
+   )*24 + hour /* now have hours - midnight tomorrow handled here */
  )*60 + min /* now have minutes */
)*60 + sec; /* finally seconds */
 }
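
The reason no code change is needed can be shown with the hour/minute/second tail on its own.  In this sketch the day count is passed in directly (a hypothetical simplification, the real function derives it from year, month and day).

```c
#include <stdint.h>

/* Tail of the conversion: whole days plus time of day, in seconds. */
static int64_t secs_from_days(int64_t days, unsigned int hour,
			      unsigned int min, unsigned int sec)
{
	return ((days * 24 + hour) * 60 + min) * 60 + sec;
}
```

Because the hour is simply added to days * 24, passing hour 24 on day N gives the same result as hour 0 on day N + 1, with no special casing.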


[PATCH 5/5] X.509: Handle midnight alternative notation in GeneralizedTime

2015-12-17 Thread David Howells

The ASN.1 GeneralizedTime object carries an ISO 8601 format date and time.
The time is permitted to show midnight as 00:00 or 24:00 (the latter being
equivalent to 00:00 of the following day).

The permitted value is checked in x509_decode_time() but the actual
handling is left to mktime64().

Without this patch, certain X.509 certificates will be rejected and could
lead to an unbootable kernel.

Reported-by: Rudolf Polzer 
Signed-off-by: David Howells 
cc: David Woodhouse 
cc: John Stultz 
cc: Arnd Bergmann 
cc: sta...@vger.kernel.org
---

 crypto/asymmetric_keys/x509_cert_parser.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index 9be2caebc57b..b9de251c419c 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -497,7 +497,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
static const unsigned char month_lengths[] = { 31, 28, 31, 30, 31, 30,
   31, 31, 30, 31, 30, 31 };
const unsigned char *p = value;
-   unsigned year, mon, day, hour, min, sec, mon_len, max_sec;
+   unsigned year, mon, day, hour, min, sec, mon_len, max_sec, max_hour;
 
 #define dec2bin(X) ({ unsigned char x = (X) - '0'; if (x > 9) goto 
invalid_time; x; })
 #define DD2bin(P) ({ unsigned x = dec2bin(P[0]) * 10 + dec2bin(P[1]); P += 2; 
x; })
@@ -512,6 +512,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
else
year += 2000;
max_sec = 59;
+   max_hour = 23;
} else if (tag == ASN1_GENTIM) {
/* GenTime: YYYYMMDDHHMMSSZ */
if (vlen != 15)
@@ -520,6 +521,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (year >= 1950 && year <= 2049)
goto invalid_time;
max_sec = 60; /* ISO 8601 permits leap seconds [X.680 46.3] */
+   max_hour = 24;
} else {
goto unsupported_time;
}
@@ -550,11 +552,17 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
}
 
if (day < 1 || day > mon_len ||
-   hour > 23 ||
+   hour > max_hour ||
min > 59 ||
sec > max_sec)
goto invalid_time;
 
+   /* GeneralizedTime, encoded as ISO 8601, also permits 24:00 today as an
+* alternative for 00:00 tomorrow.
+*/
+   if (hour == 24 && (min != 0 || sec != 0))
+   goto invalid_time;
+
*_t = mktime64(year, mon, day, hour, min, sec);
return 0;
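
Taken together with the leap second change, the time-of-day checks now depend on the tag.  A standalone sketch of the combined rule follows; the enum names are invented here, though 23 and 24 are the ASN.1 universal tag numbers for UTCTime and GeneralizedTime.

```c
#include <stdbool.h>

/* ASN.1 universal tag numbers for the two time types (names invented) */
enum { TAG_UTCTIME = 23, TAG_GENERALIZEDTIME = 24 };

/* Is hour:min:sec acceptable for a time value carried under 'tag'? */
static bool time_of_day_valid(int tag, unsigned int hour,
			      unsigned int min, unsigned int sec)
{
	unsigned int max_hour, max_sec;

	if (tag == TAG_UTCTIME) {
		max_hour = 23;
		max_sec = 59;
	} else if (tag == TAG_GENERALIZEDTIME) {
		max_hour = 24;	/* 24:00:00 permitted as end-of-day midnight */
		max_sec = 60;	/* leap second permitted */
	} else {
		return false;
	}

	if (hour > max_hour || min > 59 || sec > max_sec)
		return false;

	/* Hour 24 is only meaningful as exactly 24:00:00 */
	if (hour == 24 && (min != 0 || sec != 0))
		return false;

	return true;
}
```

So 24:00:00 passes for GeneralizedTime only, and 24:00:01 is rejected for both.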
 


[PATCH 0/5] X.509: Fix time handling

2015-12-17 Thread David Howells


Here's a set of patches that fix X.509 time handling in three ways:

 (1) Fix leap year handling.

 (2) Add leap second handling (where you get a time of 23:59:60).

 (3) Add end-of-day midnight encoding (where you get a time of 24:00:00).

David
---
David Howells (5):
  X.509: Fix leap year handling again
  Handle leap seconds in mktime64()
  X.509: Support leap seconds
  Handle both ISO 8601 encodings of midnight in mktime64()
  X.509: Handle midnight alternative notation in GeneralizedTime


 crypto/asymmetric_keys/x509_cert_parser.c |   24 +---
 include/linux/time.h  |   13 ++---
 kernel/time/time.c|   19 +++
 3 files changed, 38 insertions(+), 18 deletions(-)


Re: [PATCH] X.509: Fix the time validation [ver #3]

2015-12-11 Thread David Howells

Greg Kroah-Hartman  wrote:

> David, any reason you didn't put a cc: stable in the commit for it to be
> picked up in the stable releases?

I did cc it to stable.

David

Re: [PATCH 0/2] security: clarify that some code is really non-modular

2015-12-10 Thread David Howells

Paul Gortmaker  wrote:

> Paul Gortmaker (2):
>   security/keys: make big_key.c explicitly non-modular
>   security/integrity: make ima/ima_mok.c explicitly non-modular

Note that I only see patch 1.  Note also that keyri...@linux-nfs.org should
now be keyri...@vger.kernel.org.

David

Re: [PATCH] X.509: Fix leap year handling again and support leap seconds

2015-12-10 Thread David Howells

David Howells  wrote:

> > the leap second support still looks a bit suspect, as mktime64 will convert
> > mm/dd/ HH/MM/60 and mm/dd/ HH/MM+1/00 to the same time64_t,
> > essentially meaning that two different inputs can yield the same output,
> > possibly violating ASN.1 CER and DER rules.
> 
> That's a 'bug' in mktime64() not my parsing of the ASN.1.  If it's valid ASN.1
> then we should accept it.

Any thoughts on how to handle this?  I really want to push this patch
upstream.

David

Re: [PATCH] X.509: Fix leap year handling again and support leap seconds

2015-12-10 Thread David Howells

Rudolf Polzer  wrote:

> Also, while at it - apparently hour 24 is allowed by ISO 8601 too as long as
> minutes and seconds are zero, leading to even more non-canonicality... can
> you check whether this is also valid ASN.1 then?

Sorry, I missed this bit.  The ASN.1 spec says that GeneralizedTime is ISO
8601 format.

> > It's not entirely clear that ASN.1 expects it, but we can relax the
> > seconds check slightly for GeneralizedTime.

What I'm not sure of is whether other ASN.1 implementations will expect it.

David

Re: keyring timestamps

2015-12-01 Thread David Howells

Petko Manolov  wrote:

>   0) does keyrings keep a timestamp when created or last updated?  David?

No.

> 0) is crucial.  If there is no such thing as "time of the last update" for
> keyrings i guess we'll either have to implement it or use another mechanism
> to get similar result.

You haven't said why you want it?  Update what?

David

Re: [PATCH] X.509: Fix leap year handling again and support leap seconds

2015-12-01 Thread David Howells

Rudolf Polzer  wrote:

> the leap second support still looks a bit suspect, as mktime64 will convert
> mm/dd/ HH/MM/60 and mm/dd/ HH/MM+1/00 to the same time64_t,
> essentially meaning that two different inputs can yield the same output,
> possibly violating ASN.1 CER and DER rules.

That's a 'bug' in mktime64() not my parsing of the ASN.1.  If it's valid ASN.1
then we should accept it.

David

[PATCH] X.509: Fix leap year handling again and support leap seconds

2015-12-01 Thread David Howells

There are still a couple of minor issues in the X.509 leap year handling:

 (1) To avoid doing a modulus-by-400 in addition to a modulus-by-100 when
 determining whether the year is a leap year or not, I divided the year
 by 100 after doing the modulus-by-100, thereby letting the compiler do
 one instruction for both, and then did a modulus-by-4.

 Unfortunately, I then passed the now-modified year value to mktime64()
 to construct a time value.

 Since this isn't a fast path and since mktime64() does a bunch of
 divisions, just condense down to "% 400".  It's also easier to read.

 (2) The default month length for any February where the year doesn't
 divide by four exactly is obtained from the month_lengths[] array where
 the value is 29, not 28.

 This is fixed by altering the table.

In addition:

 (3) The format of ASN.1 GeneralizedTime seems to be specified by ISO 8601
 [X.680 46.3] and this apparently supports leap seconds (ie. the
 seconds field is 60).  It's not entirely clear that ASN.1 expects it,
 but we can relax the seconds check slightly for GeneralizedTime.

 UTCTime, however, only supports a seconds value in the range 00-59.

Reported-by: Rudolf Polzer 
Signed-off-by: David Howells 
---
diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index 021d39c0ba75..f57c3c1b5ae7 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -494,10 +494,10 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
 unsigned char tag,
 const unsigned char *value, size_t vlen)
 {
-   static const unsigned char month_lengths[] = { 31, 29, 31, 30, 31, 30,
+   static const unsigned char month_lengths[] = { 31, 28, 31, 30, 31, 30,
   31, 31, 30, 31, 30, 31 };
const unsigned char *p = value;
-   unsigned year, mon, day, hour, min, sec, mon_len;
+   unsigned year, mon, day, hour, min, sec, mon_len, max_sec;
 
 #define dec2bin(X) ({ unsigned char x = (X) - '0'; if (x > 9) goto 
invalid_time; x; })
 #define DD2bin(P) ({ unsigned x = dec2bin(P[0]) * 10 + dec2bin(P[1]); P += 2; 
x; })
@@ -511,6 +511,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
year += 1900;
else
year += 2000;
+   max_sec = 59;
} else if (tag == ASN1_GENTIM) {
/* GenTime: YYYYMMDDHHMMSSZ */
if (vlen != 15)
@@ -518,6 +519,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
year = DD2bin(p) * 100 + DD2bin(p);
if (year >= 1950 && year <= 2049)
goto invalid_time;
+   max_sec = 60; /* ISO 8601 permits leap seconds [X.680 46.3] */
} else {
goto unsupported_time;
}
@@ -540,9 +542,9 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (year % 4 == 0) {
mon_len = 29;
if (year % 100 == 0) {
-   year /= 100;
-   if (year % 4 != 0)
-   mon_len = 28;
+   mon_len = 28;
+   if (year % 400 == 0)
+   mon_len = 29;
}
}
}
@@ -550,7 +552,7 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (day < 1 || day > mon_len ||
hour > 23 ||
min > 59 ||
-   sec > 59)
+   sec > max_sec)
goto invalid_time;
 
*_t = mktime64(year, mon, day, hour, min, sec);

Re: [PATCH 1/2] KEYS: Reserve an extra certificate symbol for inserting without recompiling

2015-11-26 Thread David Howells

Mehmet Kayaalp  wrote:

> Place a system_extra_cert buffer of configurable size, right after the
> system_certificate_list, so that inserted keys can be readily processed by
> the existing mechanism.

Do you have a particular use case for this?

David

Re: [PATCH] KEYS: Fix handling of stored error in a negatively instantiated user key

2015-11-25 Thread David Howells

James Morris  wrote:

> Is this triggerable by normal users?

Yes.

David

Re: [PATCH] KEYS: Fix handling of stored error in a negatively instantiated user key

2015-11-24 Thread David Howells

Hi James,

Can this be passed straight to Linus please?

Thanks,
David

[PATCH] KEYS: Fix handling of stored error in a negatively instantiated user key

2015-11-24 Thread David Howells

If a user key gets negatively instantiated, an error code is cached in the
payload area.  A negatively instantiated key may then be positively
instantiated by updating it with valid data.  However, the ->update key
type method must be aware that the error code may be there.

The following may be used to trigger the bug in the user key type:

keyctl request2 user user "" @u
keyctl add user user "a" @u

which manifests itself as:

BUG: unable to handle kernel paging request at ff8a
IP: [] __call_rcu.constprop.76+0x1f/0x280 
kernel/rcu/tree.c:3046
PGD 7cc30067 PUD 0
Oops: 0002 [#1] SMP
Modules linked in:
CPU: 3 PID: 2644 Comm: a.out Not tainted 4.3.0+ #49
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 
01/01/2011
task: 88003ddea700 ti: 88003dd88000 task.ti: 88003dd88000
RIP: 0010:[]  [] 
__call_rcu.constprop.76+0x1f/0x280
 [] __call_rcu.constprop.76+0x1f/0x280 
kernel/rcu/tree.c:3046
RSP: 0018:88003dd8bdb0  EFLAGS: 00010246
RAX: ff82 RBX:  RCX: 0001
RDX: 81e3fe40 RSI:  RDI: ff82
RBP: 88003dd8bde0 R08: 88007d2d2da0 R09: 
R10:  R11: 88003e8073c0 R12: ff82
R13: 88003dd8be68 R14: 88007d027600 R15: 88003ddea700
FS:  00b92880(0063) GS:88007fd0() 
knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: ff8a CR3: 7cc5f000 CR4: 06e0
Stack:
 88003dd8bdf0 81160a8a  ff82
 88003dd8be68 88007d027600 88003dd8bdf0 810a39e5
 88003dd8be20 812a31ab 88007d027600 88007d027620
Call Trace:
 [] kfree_call_rcu+0x15/0x20 kernel/rcu/tree.c:3136
 [] user_update+0x8b/0xb0 
security/keys/user_defined.c:129
 [< inline >] __key_update security/keys/key.c:730
 [] key_create_or_update+0x291/0x440 
security/keys/key.c:908
 [< inline >] SYSC_add_key security/keys/keyctl.c:125
 [] SyS_add_key+0x101/0x1e0 security/keys/keyctl.c:60
 [] entry_SYSCALL_64_fastpath+0x12/0x6a 
arch/x86/entry/entry_64.S:185

Note the error code (-ENOKEY) in RAX, RDI and R12 (the values ending in ff82).

A similar bug can be tripped by:

keyctl request2 trusted user "" @u
keyctl add trusted user "a" @u

This should also affect encrypted keys - but that has to be correctly
parameterised or it will fail with EINVAL before getting to the bit that
will crash.
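
The failure mode can be modelled in userspace.  Everything in this sketch is hypothetical (struct toy_key, the helper names, the TOY_ENOKEY constant); it only illustrates the rule that an update method must not treat the payload slot as a pointer while the key is negative, and is not the kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { TOY_ENOKEY = 126 };	/* stand-in for the kernel's ENOKEY */

struct toy_key {
	bool negative;	/* models KEY_FLAG_NEGATIVE */
	void *payload;	/* a pointer, or a cached error code when negative */
};

/* Negatively instantiate: stash the error where the pointer would be. */
static void toy_negate(struct toy_key *key, int error)
{
	key->negative = true;
	key->payload = (void *)(intptr_t)-error;
}

/* Update: only free the old payload if it really was a pointer. */
static int toy_update(struct toy_key *key, const char *data)
{
	size_t len = strlen(data) + 1;
	char *new_payload = malloc(len);
	void *zap = NULL;

	if (!new_payload)
		return -ENOMEM;
	memcpy(new_payload, data, len);

	if (!key->negative)
		zap = key->payload;	/* genuine old data, safe to free */

	key->payload = new_payload;
	key->negative = false;
	free(zap);			/* free(NULL) is a no-op */
	return 0;
}
```

Without the check on the negative flag, the old "payload" handed to the free routine would be the error code cast to a pointer, which is what the oops above shows.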

Reported-by: Dmitry Vyukov 
Signed-off-by: David Howells 
Acked-by: Mimi Zohar 
---

 security/keys/encrypted-keys/encrypted.c |2 ++
 security/keys/trusted.c  |5 -
 security/keys/user_defined.c |5 -
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/security/keys/encrypted-keys/encrypted.c 
b/security/keys/encrypted-keys/encrypted.c
index 927db9f35ad6..696ccfa08d10 100644
--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -845,6 +845,8 @@ static int encrypted_update(struct key *key, struct 
key_preparsed_payload *prep)
size_t datalen = prep->datalen;
int ret = 0;
 
+   if (test_bit(KEY_FLAG_NEGATIVE, &key->flags))
+   return -ENOKEY;
if (datalen <= 0 || datalen > 32767 || !prep->data)
return -EINVAL;
 
diff --git a/security/keys/trusted.c b/security/keys/trusted.c
index 903dace648a1..16dec53184b6 100644
--- a/security/keys/trusted.c
+++ b/security/keys/trusted.c
@@ -1007,13 +1007,16 @@ static void trusted_rcu_free(struct rcu_head *rcu)
  */
 static int trusted_update(struct key *key, struct key_preparsed_payload *prep)
 {
-   struct trusted_key_payload *p = key->payload.data[0];
+   struct trusted_key_payload *p;
struct trusted_key_payload *new_p;
struct trusted_key_options *new_o;
size_t datalen = prep->datalen;
char *datablob;
int ret = 0;
 
+   if (test_bit(KEY_FLAG_NEGATIVE, &key->flags))
+   return -ENOKEY;
+   p = key->payload.data[0];
if (!p->migratable)
return -EPERM;
if (datalen <= 0 || datalen > 32767 || !prep->data)
diff --git a/security/keys/user_defined.c b/security/keys/user_defined.c
index 28cb30f80256..8705d79b2c6f 100644
--- a/security/keys/user_defined.c
+++ b/security/keys/user_defined.c
@@ -120,7 +120,10 @@ int user_update(struct key *key, struct 
key_preparsed_payload *prep)
 
if (ret == 0) {
/* attach the new data, displacing the old */
-   zap = key->payload.data[0];
+   if (!test_bit(

Re: [RFC] readlink()-related oddities

2015-11-20 Thread David Howells

Al Viro  wrote:

> All of them?  I see two kinds there - one is magical symlink (recognized
> by contents in afs_iget()), another is this autocell thing, the latter
> having no ->readlink().  Both serve as automount points, don't they?

The "autocell" thing is where you don't have an AFS file of that name and
lookup of that non-existent file is treated as an attempt to mount a
destination volume encoded by the filename.

David

[RFC] KEYS: Exposing {a,}symmetric key ops to userspace and other bits

2015-11-20 Thread David Howells

Hi Marcel, Mimi, Tadeus,

I want to consider adding or doing the following bits to the keyrings
facility, aiming for the next merge window:

 (*) Bring in the patches that I posted to change how the trust model on a
 keyring works.

 The model will then be that keys aren't automatically marked trusted, but
 linking a key into a keyring that is marked trusted-only will validate
 the key against the contents of the keyring before permitting its
 addition.

 Note that we can then vary the policy on a per-keyring basis.

 (*) Add Mimi's patches to allow keys/keyrings to be marked undeletable.  This
 is for the purpose of creating blacklists and to prevent people from
 removing entries in the blacklist.  Note that only the kernel can create
 a blacklist - we don't want userspace generating them as a way to take up
 kernel space.

 I think the right way to do this is to not allow marked keys to be
 unlinked from marked keyrings, but to allow marked keys to be unlinked
 from ordinary keyrings.

 The reason the 'keep' mark is required on individual keys is to prevent
 the keys from being directly revoked, expired or invalidated by keyctl
 without reference to the keyring.  Marked keys that are set expirable
 when they're created will still expire and be subsequently removed and if
 a marked key or marked keyring loses all its references it still gets
 gc'd.

 (*) Provide KEYCTL_{SIGN,VERIFY,ENCRYPT,DECRYPT} operations for use with
 asymmetric keys, allowing offload to hardware or use of the crypto
 routines for a software fallback.

 One question is as to how to set parameters.  The key will be specified
 by a key ID and this will set the crypto algorithm (eg. RSA, DSA, ECDSA,
 etc.) and the key size (eg. RSA-4096), but other parameters will need to
 be supplied such as:

 - Hash type.  I'm expecting the hash value to be passed through this
   interface not the data-to-be-hashed, but the type may need to be known
   for other purposes.

 - Password to decrypt the private key.  I'm not sure whether this should
   be presented at the point of key usage or the point of key
   instantiation.  The former means that you don't have an unsecured key
   sitting around in the kernel.

 Another question is in what form the data should be presented.  In many
 ways, I would favour raw data with internal metadata attached as
 appropriate by userspace (eg. the hash algorithm OID included in a
 signature as per RFC4880 sec 5.2.2).  I would certainly rather avoid any
 ASN.1 or PGP encodings in this interface.

 One problem we have is that we only have four arguments to play with, one
 of which has to represent the key ID, but we need two buffers, two buffer
 lengths and some options per operation.  However, we could include the
 buffer lengths inside the options maybe:

keyctl_sign(int key, const char *options, const void *data,
void *buffer);

 Another option is to allow a key to be queried for the buffer sizes and
 always require that amount of data - maybe something like:

struct keyctl_asymmetric_info {
unsigned encrypted_data_size;
unsigned decrypted_data_size;
unsigned signature_size;
unsigned signed_data_size;
} info;
keyctl_query_asymmetric(key, &info);

 Possibly these values will all be the same, so we might only need to get one
 value back.  I'm assuming here that userspace would do the dressing up of
 the data for signing with whatever metadata and padding is required.

 (*) In reference to the above, potentially provide a KEYCTL_KEY_UNLOCK that
 takes a key and password and gives you another key that has the private
 key unlocked that you can use temporarily and then discard.  I'm not sure
 how best to manage *hardware* private keys though - and I suspect that
 will be hardware dependent.

 (*) A TPM asymmetric key subtype that allows access to asymmetric keys stored
 in a TPM.

 (*) Provide KEYCTL_SEAL_KEY for sealing an asymmetric key to hardware.

 (*) Add a symmetric key type that acts as a container for a symmetric key,
 using either hardware or software, to be accessible through AF_ALG.

 (*) Provide a way to generate a new symmetric key, encrypting it with an
 asymmetric key inside the kernel.

 Again, how to parameterise is probably a tricky question.

 (*) Sort out the KEYCTL_UPDATE mess with trusted and encrypted keys.
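
As a rough illustration of the query-then-allocate idea floated in the KEYCTL_{SIGN,VERIFY,ENCRYPT,DECRYPT} item above, here is a userspace mock.  Nothing in it exists yet: mock_query_asymmetric() stands in for the proposed keyctl_query_asymmetric(), takes a key size in bits rather than a key ID, and assumes all four sizes equal the key size in bytes, which is just the "possibly these values will all be the same" case.

```c
#include <stddef.h>
#include <stdlib.h>

struct keyctl_asymmetric_info {
	unsigned encrypted_data_size;
	unsigned decrypted_data_size;
	unsigned signature_size;
	unsigned signed_data_size;
};

/* Mock of the proposed query: pretend every size is the key size in bytes. */
static int mock_query_asymmetric(unsigned key_bits,
				 struct keyctl_asymmetric_info *info)
{
	unsigned bytes = key_bits / 8;

	if (key_bits == 0 || key_bits % 8)
		return -1;
	info->encrypted_data_size = bytes;
	info->decrypted_data_size = bytes;
	info->signature_size = bytes;
	info->signed_data_size = bytes;
	return 0;
}

/* Caller pattern: ask for the sizes first, then size the output buffer. */
static unsigned char *alloc_signature_buffer(unsigned key_bits, size_t *len)
{
	struct keyctl_asymmetric_info info;

	if (mock_query_asymmetric(key_bits, &info) < 0)
		return NULL;
	*len = info.signature_size;
	return calloc(1, *len);
}
```

The attraction of this shape is that the sign/verify calls themselves would not need to carry buffer lengths, which eases the four-argument squeeze mentioned above.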

David

Re: [RFC] readlink()-related oddities

2015-11-20 Thread David Howells

Al Viro  wrote:

> How would those tools know that this particular pathname _is_ a magical
> symlink?  Sure, if you see a symlink with body that starts with % or #,
> you could figure out that it's not a regular one and go parse the body,
> but for stat(2) it looks like a directory.  Do those tools call readlink()
> on every directory they spot on AFS volume?  David?

It has to be a directory so that you can mount on it.  If you look in /afs on
an OpenAFS client filesystem it appears as a symlink to somewhere under /afs/
because they can't do the in-kernel mounting (it's GPL-only on Linux).

David

Re: [RFC] readlink()-related oddities

2015-11-20 Thread David Howells

Al Viro  wrote:

> 3) normally, readlink(2) fails for non-symlinks.  Moreover, according to
> POSIX it should do so (with -EINVAL).  There is a pathological case when
> it succeeds for a directory, though.  Namely, one of the kinds of AFS
> "mountpoints".

All AFS mountpoints are magic symlinks that are specially interpreted by the
client as far as I'm aware.  I'm not sure why the designers didn't just select
a different file type for them, but they didn't.

Unfortunately, it means that iget has to read the contents of the symlinks :-/
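
For reference, the normal behaviour Al describes (readlink(2) failing with EINVAL on something that is not a symlink) can be checked from userspace with a small sketch like this.  probe_readlink() is a made-up helper name, and the 4096-byte buffer is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: call readlink(2) on path.
 * Returns 0 if the path could be read as a symlink, otherwise the
 * errno value that readlink() reported.
 */
int probe_readlink(const char *path)
{
	char buf[4096];
	ssize_t n;

	assert(path != NULL);

	n = readlink(path, buf, sizeof(buf));
	return n < 0 ? errno : 0;
}
```

On an ordinary directory such as "/" this is expected to return EINVAL; the AFS mountpoint case discussed here is the oddity where stat(2) reports a directory yet the link body can still be read.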

> stat(2) reports those as directories, stepping into them leads to
> automounting a directory there (why do we have ->open() for them, BTW?).

I think I put that in to make sure the open() syscall returned EREMOTE rather
than another error if you tried to open it.  It can probably be removed
because with the d_automount code you can't ever get there I think - unless
you can pass AT_NO_AUTOMOUNT to openat().

> How the hell is userland supposed to guess to call readlink(2) on those
> suckers to get the information of what'll get automounted there if we step
> upon them?

There's an AFS userspace command that could be used to query a mountpoint that
was going to use it.  However, I suspect readlink() will now always trigger
the automount.  This is one of the things OpenAFS uses pioctl() for - but
since I'm not allowed to add that to the kernel, I have to find some other way
of doing it.

> And could we please get rid of that kludge?  David?

Sure.

David

[PATCH] X.509: Fix the time validation [ver #3]

2015-11-12 Thread David Howells

This fixes CVE-2015-5327.  It affects kernels from 4.3-rc1 onwards.

Fix the X.509 time validation to use month number-1 when looking up the
number of days in that month.  Also put the month number validation before
doing the lookup so as not to risk overrunning the array.
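
The ordering issue can be illustrated with a standalone sketch.  This is not the kernel function: days_in_month() and its -1 return value are invented for the example, and the leap-year rule is the standard Gregorian one.  Without the range check and the "- 1", a month value of 12 would index one element past the end of the 12-entry table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static const unsigned char month_lengths[12] = {
	31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

/*
 * Illustrative helper: return the number of days in month mon of year,
 * or -1 if mon is out of range.  mon is 1-based, so it is range-checked
 * before it is turned into a 0-based table index.
 */
int days_in_month(unsigned int year, unsigned int mon)
{
	if (mon < 1 || mon > 12)
		return -1;	/* reject before touching the array */

	assert(mon - 1 < sizeof(month_lengths) / sizeof(month_lengths[0]));

	if (mon == 2) {
		bool leap = (year % 4 == 0 && year % 100 != 0) ||
			    year % 400 == 0;
		return leap ? 29 : 28;
	}
	return month_lengths[mon - 1];
}
```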

This can be tested by doing the following:

cat <
Signed-off-by: David Howells 
Tested-by: Mimi Zohar 
Acked-by: David Woodhouse 
---

 crypto/asymmetric_keys/x509_cert_parser.c |   12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index 3000ea3b6687..021d39c0ba75 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -531,7 +531,11 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (*p != 'Z')
goto unsupported_time;
 
-   mon_len = month_lengths[mon];
+   if (year < 1970 ||
+   mon < 1 || mon > 12)
+   goto invalid_time;
+
+   mon_len = month_lengths[mon - 1];
if (mon == 2) {
if (year % 4 == 0) {
mon_len = 29;
@@ -543,14 +547,12 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
}
}
 
-   if (year < 1970 ||
-   mon < 1 || mon > 12 ||
-   day < 1 || day > mon_len ||
+   if (day < 1 || day > mon_len ||
hour > 23 ||
min > 59 ||
sec > 59)
goto invalid_time;
-   
+
*_t = mktime64(year, mon, day, hour, min, sec);
return 0;
 


[PATCH] X.509: Fix the time validation [ver #2]

2015-11-12 Thread David Howells

This fixes CVE-2015-5327.  It affects kernels from 4.3-rc1 onwards.

Fix the X.509 time validation to use month number-1 when looking up the
number of days in that month.  Also put the month number validation before
doing the lookup so as not to risk overrunning the array.

This can be tested by doing the following:

cat <
Signed-off-by: David Howells 
Tested-by: Mimi Zohar 
Acked-by: David Woodhouse 
---

 crypto/asymmetric_keys/x509_cert_parser.c |   12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index 3000ea3b6687..021d39c0ba75 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -531,7 +531,11 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (*p != 'Z')
goto unsupported_time;
 
-   mon_len = month_lengths[mon];
+   if (year < 1970 ||
+   mon < 1 || mon > 12)
+   goto invalid_time;
+
+   mon_len = month_lengths[mon - 1];
if (mon == 2) {
if (year % 4 == 0) {
mon_len = 29;
@@ -543,14 +547,12 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
}
}
 
-   if (year < 1970 ||
-   mon < 1 || mon > 12 ||
-   day < 1 || day > mon_len ||
+   if (day < 1 || day > mon_len ||
hour > 23 ||
min > 59 ||
sec > 59)
goto invalid_time;
-   
+
*_t = mktime64(year, mon, day, hour, min, sec);
return 0;
 


[PATCH] X.509: Fix the time validation

2015-11-11 Thread David Howells

This fixes CVE-2015-5327.  It affects kernels from 4.3-rc1 onwards.

Fix the X.509 time validation to use month number-1 when looking up the
number of days in that month.  Also put the month number validation before
doing the lookup so as not to risk overrunning the array.

This can be tested by doing the following:

cat <
Signed-off-by: David Howells 
Tested-by: Mimi Zohar 
Acked-by: David Woodhouse 
---

 crypto/asymmetric_keys/x509_cert_parser.c |   12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/crypto/asymmetric_keys/x509_cert_parser.c 
b/crypto/asymmetric_keys/x509_cert_parser.c
index af71878dc15b..e8d7b0342f5f 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -531,7 +531,11 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
if (*p != 'Z')
goto unsupported_time;
 
-   mon_len = month_lengths[mon];
+   if (year < 1970 ||
+   mon < 1 || mon > 12)
+   goto invalid_time;
+
+   mon_len = month_lengths[mon - 1];
if (mon == 2) {
if (year % 4 == 0) {
mon_len = 29;
@@ -543,14 +547,12 @@ int x509_decode_time(time64_t *_t,  size_t hdrlen,
}
}
 
-   if (year < 1970 ||
-   mon < 1 || mon > 12 ||
-   day < 1 || day > mon_len ||
+   if (day < 1 || day > mon_len ||
hour < 0 || hour > 23 ||
min < 0 || min > 59 ||
sec < 0 || sec > 59)
goto invalid_time;
-   
+
*_t = mktime64(year, mon, day, hour, min, sec);
return 0;
 


Re: [GIT PULL] KEYS: Miscellaneous patches for next

2015-10-23 Thread David Howells

James Morris  wrote:

> Have these been in next yet?

No.

David

[GIT PULL] KEYS: Miscellaneous patches for next

2015-10-22 Thread David Howells

Hi James,

Could you pull these changes into your next branch please?

There are three groups:

 (1) Miscellaneous cleanups.

 (2) Add scripts for extracting system cert list and module sigs.

 (3) Condense the type-specific data in the key struct into the payload
 data as it doesn't really make any sense to keep them separate.

David
---
The following changes since commit 09302fd19efbff9569eaad3f78ead8f411defd87:

  Merge branch 'smack-for-4.4' of https://github.com/cschaufler/smack-next into 
next (2015-10-21 10:49:29 +1100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git 
tags/keys-next-20151021

for you to fetch changes up to 146aa8b1453bd8f1ff2304ffb71b4ee0eb9acdcc:

  KEYS: Merge the type-specific data with the payload data (2015-10-21 15:18:36 
+0100)


Keyring changes for next

--------
David Howells (3):
  KEYS: Provide a script to extract the sys cert list from a vmlinux file
  KEYS: Provide a script to extract a module signature
  KEYS: Merge the type-specific data with the payload data

Geliang Tang (1):
  KEYS: use kvfree() in add_key

Insu Yun (1):
  keys: Be more consistent in selection of union members used

Paul Gortmaker (1):
  certs: add .gitignore to stop git nagging about x509_certificate_list

 Documentation/crypto/asymmetric-keys.txt |  27 +++--
 Documentation/security/keys.txt  |  41 ---
 certs/.gitignore |   4 +
 crypto/asymmetric_keys/asymmetric_keys.h |   5 -
 crypto/asymmetric_keys/asymmetric_type.c |  44 ---
 crypto/asymmetric_keys/public_key.c  |   4 +-
 crypto/asymmetric_keys/signature.c   |   2 +-
 crypto/asymmetric_keys/x509_parser.h |   1 +
 crypto/asymmetric_keys/x509_public_key.c |   9 +-
 fs/cifs/cifs_spnego.c|   6 +-
 fs/cifs/cifsacl.c|  25 ++--
 fs/cifs/connect.c|   9 +-
 fs/cifs/sess.c   |   2 +-
 fs/cifs/smb2pdu.c|   2 +-
 fs/ecryptfs/ecryptfs_kernel.h|   5 +-
 fs/ext4/crypto_key.c |   4 +-
 fs/f2fs/crypto_key.c |   4 +-
 fs/fscache/object-list.c |   4 +-
 fs/nfs/nfs4idmap.c   |   4 +-
 include/crypto/public_key.h  |   1 -
 include/keys/asymmetric-subtype.h|   2 +-
 include/keys/asymmetric-type.h   |  15 +++
 include/keys/user-type.h |   8 ++
 include/linux/key-type.h |   3 +-
 include/linux/key.h  |  33 +++---
 kernel/.gitignore|   1 -
 kernel/module_signing.c  |   1 +
 lib/digsig.c |   7 +-
 net/ceph/ceph_common.c   |   2 +-
 net/ceph/crypto.c|   6 +-
 net/dns_resolver/dns_key.c   |  20 ++--
 net/dns_resolver/dns_query.c |   7 +-
 net/dns_resolver/internal.h  |   8 ++
 net/rxrpc/af_rxrpc.c |   2 +-
 net/rxrpc/ar-key.c   |  32 ++---
 net/rxrpc/ar-output.c|   2 +-
 net/rxrpc/ar-security.c  |   4 +-
 net/rxrpc/rxkad.c|  16 +--
 scripts/extract-module-sig.pl| 136 +
 scripts/extract-sys-certs.pl | 144 +++
 security/integrity/evm/evm_crypto.c  |   2 +-
 security/keys/big_key.c  |  47 +---
 security/keys/encrypted-keys/encrypted.c |  18 +--
 security/keys/encrypted-keys/encrypted.h |   4 +-
 security/keys/encrypted-keys/masterkey_trusted.c |   4 +-
 security/keys/key.c  |  20 ++--
 security/keys/keyctl.c   |  12 +-
 security/keys/keyring.c  |  12 +-
 security/keys/process_keys.c |   4 +-
 security/keys/request_key.c  |   4 +-
 security/keys/request_key_auth.c |  12 +-
 security/keys/trusted.c  |   6 +-
 security/keys/user_defined.c |  14 +--
 53 files changed, 572 insertions(+), 239 deletions(-)
 create mode 100644 certs/.gitignore
 create mode 100755 scripts/extract-module-sig.pl
 create mode 100755 scripts/extract-sys-certs.pl

[PATCH 05/10] KEYS: Add identifier pointers to public_key_signature struct

2015-10-21 Thread David Howells

Add key identifier pointers to public_key_signature struct so that they can
be used to retain the identifier of the key to be used to verify the
signature in both PKCS#7 and X.509.

Signed-off-by: David Howells 
---

 crypto/asymmetric_keys/public_key.c |2 ++
 include/crypto/public_key.h |1 +
 2 files changed, 3 insertions(+)

diff --git a/crypto/asymmetric_keys/public_key.c 
b/crypto/asymmetric_keys/public_key.c
index e537aaeafdbf..f5b4824b7c77 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -72,6 +72,8 @@ void public_key_free(struct public_key *key,
}
 
if (sig) {
+   for (i = 0; i < ARRAY_SIZE(sig->auth_ids); i++)
+   kfree(sig->auth_ids[i]);
for (i = 0; i < ARRAY_SIZE(sig->mpi); i++)
mpi_free(sig->mpi[i]);
kfree(sig->digest);
diff --git a/include/crypto/public_key.h b/include/crypto/public_key.h
index a3f8f8268e23..ed86bfb23e89 100644
--- a/include/crypto/public_key.h
+++ b/include/crypto/public_key.h
@@ -76,6 +76,7 @@ struct public_key {
  * Public key cryptography signature data
  */
 struct public_key_signature {
+   struct asymmetric_key_id *auth_ids[2];
u8 *digest;
u8 digest_size; /* Number of bytes in digest */
u8 nr_mpi;  /* Occupancy of mpi[] */


[PATCH 06/10] X.509: Retain the key verification data

2015-10-21 Thread David Howells

Retain the key verification data (ie. the struct public_key_signature)
including the digest and the key identifiers.

Note that this means that we need to take a separate copy of the digest in
x509_get_sig_params() rather than lumping it in with the crypto layer data.

Signed-off-by: David Howells 
---

 crypto/asymmetric_keys/pkcs7_trust.c  |8 ++-
 crypto/asymmetric_keys/pkcs7_verify.c |   20 +
 crypto/asymmetric_keys/x509_cert_parser.c |   43 +-
 crypto/asymmetric_keys/x509_parser.h  |4 --
 crypto/asymmetric_keys/x509_public_key.c  |   68 +++--
 5 files changed, 72 insertions(+), 71 deletions(-)

diff --git a/crypto/asymmetric_keys/pkcs7_trust.c 
b/crypto/asymmetric_keys/pkcs7_trust.c
index 388007fed3b2..7bb9389fd644 100644
--- a/crypto/asymmetric_keys/pkcs7_trust.c
+++ b/crypto/asymmetric_keys/pkcs7_trust.c
@@ -77,16 +77,16 @@ static int pkcs7_validate_trust_one(struct pkcs7_message 
*pkcs7,
 
might_sleep();
last = x509;
-   sig = &last->sig;
+   sig = last->sig;
}
 
/* No match - see if the root certificate has a signer amongst the
 * trusted keys.
 */
-   if (last && (last->akid_id || last->akid_skid)) {
+   if (last && (last->sig->auth_ids[0] || last->sig->auth_ids[1])) {
key = x509_request_asymmetric_key(trust_keyring,
- last->akid_id,
- last->akid_skid,
+ last->sig->auth_ids[0],
+ last->sig->auth_ids[1],
  false);
if (!IS_ERR(key)) {
x509 = last;
diff --git a/crypto/asymmetric_keys/pkcs7_verify.c 
b/crypto/asymmetric_keys/pkcs7_verify.c
index d20c0b4b880e..e225dccdf559 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -175,6 +175,7 @@ static int pkcs7_find_key(struct pkcs7_message *pkcs7,
 static int pkcs7_verify_sig_chain(struct pkcs7_message *pkcs7,
  struct pkcs7_signed_info *sinfo)
 {
+   struct public_key_signature *sig;
struct x509_certificate *x509 = sinfo->signer, *p;
struct asymmetric_key_id *auth;
int ret;
@@ -194,14 +195,15 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message 
*pkcs7,
goto maybe_missing_crypto_in_x509;
 
pr_debug("- issuer %s\n", x509->issuer);
-   if (x509->akid_id)
+   sig = x509->sig;
+   if (sig->auth_ids[0])
pr_debug("- authkeyid.id %*phN\n",
-x509->akid_id->len, x509->akid_id->data);
-   if (x509->akid_skid)
+sig->auth_ids[0]->len, sig->auth_ids[0]->data);
+   if (sig->auth_ids[1])
pr_debug("- authkeyid.skid %*phN\n",
-x509->akid_skid->len, x509->akid_skid->data);
+sig->auth_ids[1]->len, sig->auth_ids[1]->data);
 
-   if ((!x509->akid_id && !x509->akid_skid) ||
+   if ((!x509->sig->auth_ids[0] && !x509->sig->auth_ids[1]) ||
strcmp(x509->subject, x509->issuer) == 0) {
/* If there's no authority certificate specified, then
 * the certificate must be self-signed and is the root
@@ -225,7 +227,7 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message 
*pkcs7,
/* Look through the X.509 certificates in the PKCS#7 message's
 * list to see if the next one is there.
 */
-   auth = x509->akid_id;
+   auth = sig->auth_ids[0];
if (auth) {
pr_debug("- want %*phN\n", auth->len, auth->data);
for (p = pkcs7->certs; p; p = p->next) {
@@ -235,7 +237,7 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message 
*pkcs7,
goto found_issuer_check_skid;
}
} else {
-   auth = x509->akid_skid;
+   auth = sig->auth_ids[1];
pr_debug("- want %*phN\n", auth->len, auth->data);
for (p = pkcs7->certs; p; p = p->next) {
if (!p->skid)
@@ -255,8 +257,8 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message 
*pkcs7,
/* We matched issuer + seri

[PATCH 03/10] KEYS: Add facility to check key trustworthiness upon link creation

2015-10-21 Thread David Howells

Add a facility whereby if KEY_FLAG_TRUSTED_ONLY is set on the destination
keyring, the creation of a link to a candidate key will cause the
trustworthiness of that key to be evaluated against the already present
contents of that keyring.  This affects operations like add_key(),
KEYCTL_LINK and KEYCTL_INSTANTIATE.

To this end:

 (1) A new key type method is provided:

int (*verify_trust)(const union key_payload *payload,
struct key *keyring);

 This is implemented by key types for which verification of one key by
 another is appropriate.  It is primarily intended for use with the
 asymmetric key type.

 When called, it is given the payload or prospective payload[*] of the
 candidate key to verify and a pointer to the destination keyring.  The
 method is expected to search the keyring for an appropriate key with
 which to verify the candidate.

 [*] If called during add_key(), preparse is called before this method,
 but a key isn't actually allocated unless the verification is
 successful.

 (2) KEY_FLAG_TRUSTED is removed.  A key is now trusted by virtue of being
 contained in the trusted-only keyring being searched.

 (3) KEY_ALLOC_TRUSTED now acts as an override.  If this is passed to
 key_create_or_update() then the ->verify_trust() method will be
 ignored and the key will be added anyway.
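
The decision a ->verify_trust() implementation is expected to make can be sketched in userspace as follows.  Everything here is a toy: struct toy_key, struct toy_keyring and toy_verify_trust() are hypothetical names, the "signature check" is a precomputed flag rather than real cryptography, and ENOKEY/EKEYREJECTED are the Linux errno values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the kernel types; all names here are hypothetical. */
struct toy_key {
	const char *id;		/* identifier of this key */
	const char *signer_id;	/* identifier of the key that signed it */
	int sig_ok;		/* nonzero if the signature would verify */
};

struct toy_keyring {
	const struct toy_key *keys[8];
	size_t nr;
};

/*
 * Sketch of a verify_trust-style check: search the destination keyring
 * for the candidate's signer and say whether the candidate may be linked.
 */
int toy_verify_trust(const struct toy_key *candidate,
		     const struct toy_keyring *keyring)
{
	size_t i;

	assert(candidate != NULL && keyring != NULL);

	for (i = 0; i < keyring->nr; i++) {
		if (strcmp(keyring->keys[i]->id, candidate->signer_id) == 0)
			return candidate->sig_ok ? 0 : -EKEYREJECTED;
	}
	return -ENOKEY;	/* nothing in the keyring can vouch for it */
}
```

The point of the design is that trust is derived from what the destination keyring already contains, rather than from a flag carried around on the candidate key.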

Signed-off-by: David Howells 
---

 Documentation/security/keys.txt  |   17 
 crypto/asymmetric_keys/x509_public_key.c |6 ++--
 include/linux/key-type.h |   10 ++-
 include/linux/key.h  |   12 +---
 security/keys/key.c  |   44 --
 security/keys/keyring.c  |   18 +++-
 6 files changed, 87 insertions(+), 20 deletions(-)

diff --git a/Documentation/security/keys.txt b/Documentation/security/keys.txt
index 8c183873b2b7..e7f3447ccd1b 100644
--- a/Documentation/security/keys.txt
+++ b/Documentation/security/keys.txt
@@ -1183,6 +1183,23 @@ The structure has a number of fields, some of which are 
mandatory:
  successfully, even if instantiate() or update() succeed.
 
 
+ (*) int (*verify_trust)(const union key_payload *payload, struct key 
*keyring);
+
+ If the keyring to which a candidate key is being added/linked is marked as
+ KEY_FLAG_TRUSTED_ONLY then this function will get called in the candidate
+ key type to verify the key or proposed key based on its payload.  It is
+ expected to use the contents of the supplied destination keyring to
+ determine whether the candidate key is to be trusted and added to the
+ keyring.
+
+ The method should return 0 to allow the addition and an error otherwise,
+ typically ENOKEY if there's no key in the keyring to verify this key and
+ EKEYREJECTED if the selected key fails to verify the candidate.
+
+ This method is optional.  If it is not supplied, keys of this type cannot
+ be added to trusted-only keyrings and EPERM will be returned.
+
+
  (*) int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
 
  This method is called to attach a payload to a key during construction.
diff --git a/crypto/asymmetric_keys/x509_public_key.c 
b/crypto/asymmetric_keys/x509_public_key.c
index 64d42981a8d7..76c211b31da7 100644
--- a/crypto/asymmetric_keys/x509_public_key.c
+++ b/crypto/asymmetric_keys/x509_public_key.c
@@ -318,10 +318,10 @@ static int x509_key_preparse(struct key_preparsed_payload 
*prep)
ret = x509_check_signature(cert->pub, cert); /* self-signed */
if (ret < 0)
goto error_free_cert;
-   } else if (!prep->trusted) {
+   } else {
ret = x509_validate_trust(cert, get_system_trusted_keyring());
-   if (!ret)
-   prep->trusted = 1;
+   if (ret == -EKEYREJECTED)
+   goto error_free_cert;
}
 
/* Propose a description */
diff --git a/include/linux/key-type.h b/include/linux/key-type.h
index 7463355a198b..5d7cf5e7f8c6 100644
--- a/include/linux/key-type.h
+++ b/include/linux/key-type.h
@@ -45,7 +45,6 @@ struct key_preparsed_payload {
size_t  datalen;/* Raw datalen */
size_t  quotalen;   /* Quota length for proposed payload */
time_t  expiry; /* Expiry time of key */
-   booltrusted;/* True if key is trusted */
 };
 
 typedef int (*request_key_actor_t)(struct key_construction *key,
@@ -95,6 +94,15 @@ struct key_type {
 */
void (*free_preparse)(struct key_preparsed_payload *prep);
 
+   /* Verify the trust on a key when added to a trusted-only keyring.
+*
+* If this method isn't provided then it is assumed that the concept of
+* trust is irrelevant to keys of this

[PATCH 02/10] PKCS#7: Make trust determination dependent on contents of trust keyring

2015-10-21 Thread David Howells

Make the determination of the trustworthiness of a key dependent on whether
a key that can verify it is present in the ring of trusted keys rather than
whether or not the verifying key has KEY_FLAG_TRUSTED set.

Signed-off-by: David Howells 
---

 certs/system_keyring.c  |   13 -
 crypto/asymmetric_keys/pkcs7_key_type.c |2 +-
 crypto/asymmetric_keys/pkcs7_parser.h   |1 -
 crypto/asymmetric_keys/pkcs7_trust.c|   16 +++-
 crypto/asymmetric_keys/verify_pefile.c  |2 +-
 crypto/asymmetric_keys/x509_parser.h|1 -
 include/crypto/pkcs7.h  |3 +--
 include/linux/verification.h|1 -
 kernel/module_signing.c |2 +-
 9 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index cf55bd3a072a..e7f286413276 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -121,7 +121,6 @@ late_initcall(load_system_certificate_list);
 int verify_pkcs7_signature(const void *data, size_t len,
   const void *raw_pkcs7, size_t pkcs7_len,
   struct key *trusted_keys,
-  int untrusted_error,
   enum key_being_used_for usage,
   int (*view_content)(void *ctx,
   const void *data, size_t len,
@@ -129,7 +128,6 @@ int verify_pkcs7_signature(const void *data, size_t len,
   void *ctx)
 {
struct pkcs7_message *pkcs7;
-   bool trusted;
int ret;
 
pkcs7 = pkcs7_parse_message(raw_pkcs7, pkcs7_len);
@@ -149,13 +147,10 @@ int verify_pkcs7_signature(const void *data, size_t len,
 
if (!trusted_keys)
trusted_keys = system_trusted_keyring;
-   ret = pkcs7_validate_trust(pkcs7, trusted_keys, &trusted);
-   if (ret < 0)
-   goto error;
-
-   if (!trusted && untrusted_error) {
-   pr_err("PKCS#7 signature not signed with a trusted key\n");
-   ret = untrusted_error;
+   ret = pkcs7_validate_trust(pkcs7, trusted_keys);
+   if (ret < 0) {
+   if (ret == -ENOKEY)
+   pr_err("PKCS#7 signature not signed with a trusted 
key\n");
goto error;
}
 
diff --git a/crypto/asymmetric_keys/pkcs7_key_type.c 
b/crypto/asymmetric_keys/pkcs7_key_type.c
index 240a5303ebb7..89b75477868d 100644
--- a/crypto/asymmetric_keys/pkcs7_key_type.c
+++ b/crypto/asymmetric_keys/pkcs7_key_type.c
@@ -71,7 +71,7 @@ static int pkcs7_preparse(struct key_preparsed_payload *prep)
 
ret = verify_pkcs7_signature(NULL, 0,
 prep->data, prep->datalen,
-NULL, -ENOKEY, usage,
+NULL, usage,
 pkcs7_view_content, prep);
 
kleave(" = %d", ret);
diff --git a/crypto/asymmetric_keys/pkcs7_parser.h 
b/crypto/asymmetric_keys/pkcs7_parser.h
index a66b19ebcf47..c8159983ed8f 100644
--- a/crypto/asymmetric_keys/pkcs7_parser.h
+++ b/crypto/asymmetric_keys/pkcs7_parser.h
@@ -22,7 +22,6 @@ struct pkcs7_signed_info {
struct pkcs7_signed_info *next;
struct x509_certificate *signer; /* Signing certificate (in msg->certs) 
*/
unsignedindex;
-   booltrusted;
boolunsupported_crypto; /* T if not usable due to 
missing crypto */
 
/* Message digest - the digest of the Content Data (or NULL) */
diff --git a/crypto/asymmetric_keys/pkcs7_trust.c 
b/crypto/asymmetric_keys/pkcs7_trust.c
index 90d6d47965b0..388007fed3b2 100644
--- a/crypto/asymmetric_keys/pkcs7_trust.c
+++ b/crypto/asymmetric_keys/pkcs7_trust.c
@@ -30,7 +30,6 @@ static int pkcs7_validate_trust_one(struct pkcs7_message 
*pkcs7,
struct public_key_signature *sig = &sinfo->sig;
struct x509_certificate *x509, *last = NULL, *p;
struct key *key;
-   bool trusted;
int ret;
 
kenter(",%u,", sinfo->index);
@@ -42,10 +41,8 @@ static int pkcs7_validate_trust_one(struct pkcs7_message 
*pkcs7,
 
for (x509 = sinfo->signer; x509; x509 = x509->signer) {
if (x509->seen) {
-   if (x509->verified) {
-   trusted = x509->trusted;
+   if (x509->verified)
goto verified;
-   }
kleave(" = -ENOKEY [cached]");
return -ENOKEY;
}
@@ -122,7 +119,6 @@ static int pkcs7_validate_trust_one(struct pkcs7_message 
*pkcs7,
 
 matched:
ret = verify_signature(key, sig);
-   trusted = test_bit(KEY_FLAG_TRUSTED, &key->flags);
key_put(key);

[PATCH 01/10] KEYS: Generalise system_verify_data() to provide access to internal content

2015-10-21 Thread David Howells

Generalise system_verify_data() to provide access to internal content
through a callback.  This allows all the PKCS#7 stuff to be hidden inside
this function and removed from the PE file parser and the PKCS#7 test key.

If external content is not required, NULL should be passed as data to the
function.  If the callback is not required, that can be set to NULL.

The function is now called verify_pkcs7_signature() to contrast with
verify_pefile_signature() and the definitions of both have been moved into
linux/verification.h along with the key_being_used_for enum.

Signed-off-by: David Howells 
---

 arch/x86/kernel/kexec-bzimage64.c   |   18 ++---
 certs/system_keyring.c  |   45 +-
 crypto/asymmetric_keys/Kconfig  |1 
 crypto/asymmetric_keys/mscode_parser.c  |   21 +++---
 crypto/asymmetric_keys/pkcs7_key_type.c |   64 +++
 crypto/asymmetric_keys/pkcs7_parser.c   |   21 +-
 crypto/asymmetric_keys/verify_pefile.c  |   40 ---
 crypto/asymmetric_keys/verify_pefile.h  |5 +-
 include/crypto/pkcs7.h  |3 +
 include/crypto/public_key.h |   14 ---
 include/keys/asymmetric-type.h  |1 
 include/keys/system_keyring.h   |7 ---
 include/linux/verification.h|   50 
 include/linux/verify_pefile.h   |   22 ---
 kernel/module_signing.c |5 +-
 15 files changed, 156 insertions(+), 161 deletions(-)
 create mode 100644 include/linux/verification.h
 delete mode 100644 include/linux/verify_pefile.h

diff --git a/arch/x86/kernel/kexec-bzimage64.c 
b/arch/x86/kernel/kexec-bzimage64.c
index 0f8a6bbaaa44..0b5da62eb203 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -19,8 +19,7 @@
 #include 
 #include 
 #include 
-#include 
-#include 
+#include 
 
 #include 
 #include 
@@ -529,18 +528,9 @@ static int bzImage64_cleanup(void *loader_data)
 #ifdef CONFIG_KEXEC_BZIMAGE_VERIFY_SIG
 static int bzImage64_verify_sig(const char *kernel, unsigned long kernel_len)
 {
-   bool trusted;
-   int ret;
-
-   ret = verify_pefile_signature(kernel, kernel_len,
- system_trusted_keyring,
- VERIFYING_KEXEC_PE_SIGNATURE,
- &trusted);
-   if (ret < 0)
-   return ret;
-   if (!trusted)
-   return -EKEYREJECTED;
-   return 0;
+   return verify_pefile_signature(kernel, kernel_len,
+  NULL,
+  VERIFYING_KEXEC_PE_SIGNATURE);
 }
 #endif
 
diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index 2570598b784d..cf55bd3a072a 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -108,16 +108,25 @@ late_initcall(load_system_certificate_list);
 #ifdef CONFIG_SYSTEM_DATA_VERIFICATION
 
 /**
- * Verify a PKCS#7-based signature on system data.
- * @data: The data to be verified.
+ * verify_pkcs7_signature - Verify a PKCS#7-based signature on system data.
+ * @data: The data to be verified (NULL if expecting internal data).
  * @len: Size of @data.
  * @raw_pkcs7: The PKCS#7 message that is the signature.
  * @pkcs7_len: The size of @raw_pkcs7.
+ * @trusted_keys: Trusted keys to use (NULL for system_trusted_keyring).
  * @usage: The use to which the key is being put.
+ * @view_content: Callback to gain access to content.
+ * @ctx: Context for callback.
  */
-int system_verify_data(const void *data, unsigned long len,
-  const void *raw_pkcs7, size_t pkcs7_len,
-  enum key_being_used_for usage)
+int verify_pkcs7_signature(const void *data, size_t len,
+  const void *raw_pkcs7, size_t pkcs7_len,
+  struct key *trusted_keys,
+  int untrusted_error,
+  enum key_being_used_for usage,
+  int (*view_content)(void *ctx,
+  const void *data, size_t len,
+  size_t asn1hdrlen),
+  void *ctx)
 {
struct pkcs7_message *pkcs7;
bool trusted;
@@ -128,7 +137,7 @@ int system_verify_data(const void *data, unsigned long len,
return PTR_ERR(pkcs7);
 
/* The data should be detached - so we need to supply it. */
-   if (pkcs7_supply_detached_data(pkcs7, data, len) < 0) {
+   if (data && pkcs7_supply_detached_data(pkcs7, data, len) < 0) {
pr_err("PKCS#7 signature with non-detached data\n");
ret = -EBADMSG;
goto error;
@@ -138,13 +147,29 @@ int system_verify_data(const void *data, unsigned long 
len,
if (ret < 0)
goto error;
 
-

[PATCH 04/10] KEYS: Allow authentication data to be stored in an asymmetric key

2015-10-21 Thread David Howells

Allow authentication data to be stored in an asymmetric key in the 4th
element of the key payload and provide a way for it to be destroyed.

For the public key subtype, this will be a public_key_signature struct.

Signed-off-by: David Howells 
---

 crypto/asymmetric_keys/asymmetric_type.c  |7 +--
 crypto/asymmetric_keys/public_key.c   |   22 +++---
 crypto/asymmetric_keys/x509_cert_parser.c |2 +-
 include/crypto/public_key.h   |5 +++--
 include/keys/asymmetric-subtype.h |2 +-
 include/keys/asymmetric-type.h|7 ---
 6 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/crypto/asymmetric_keys/asymmetric_type.c 
b/crypto/asymmetric_keys/asymmetric_type.c
index 9f2165b27d52..a79d30128821 100644
--- a/crypto/asymmetric_keys/asymmetric_type.c
+++ b/crypto/asymmetric_keys/asymmetric_type.c
@@ -331,7 +331,8 @@ static void asymmetric_key_free_preparse(struct 
key_preparsed_payload *prep)
pr_devel("==>%s()\n", __func__);
 
if (subtype) {
-   subtype->destroy(prep->payload.data[asym_crypto]);
+   subtype->destroy(prep->payload.data[asym_crypto],
+prep->payload.data[asym_auth]);
module_put(subtype->owner);
}
asymmetric_key_free_kids(kids);
@@ -346,13 +347,15 @@ static void asymmetric_key_destroy(struct key *key)
struct asymmetric_key_subtype *subtype = asymmetric_key_subtype(key);
struct asymmetric_key_ids *kids = key->payload.data[asym_key_ids];
void *data = key->payload.data[asym_crypto];
+   void *auth = key->payload.data[asym_auth];
 
key->payload.data[asym_crypto] = NULL;
key->payload.data[asym_subtype] = NULL;
key->payload.data[asym_key_ids] = NULL;
+   key->payload.data[asym_auth] = NULL;
 
if (subtype) {
-   subtype->destroy(data);
+   subtype->destroy(data, auth);
module_put(subtype->owner);
}
 
diff --git a/crypto/asymmetric_keys/public_key.c 
b/crypto/asymmetric_keys/public_key.c
index 6db4c01c6503..e537aaeafdbf 100644
--- a/crypto/asymmetric_keys/public_key.c
+++ b/crypto/asymmetric_keys/public_key.c
@@ -59,18 +59,34 @@ static void public_key_describe(const struct key 
*asymmetric_key,
 /*
  * Destroy a public key algorithm key.
  */
-void public_key_destroy(void *payload)
+void public_key_free(struct public_key *key,
+struct public_key_signature *sig)
 {
-   struct public_key *key = payload;
int i;
 
if (key) {
for (i = 0; i < ARRAY_SIZE(key->mpi); i++)
mpi_free(key->mpi[i]);
kfree(key);
+   key = NULL;
}
+
+   if (sig) {
+   for (i = 0; i < ARRAY_SIZE(sig->mpi); i++)
+   mpi_free(sig->mpi[i]);
+   kfree(sig->digest);
+   kfree(sig);
+   }
+}
+EXPORT_SYMBOL_GPL(public_key_free);
+
+/*
+ * Destroy a public key algorithm key.
+ */
+static void public_key_destroy(void *payload0, void *payload3)
+{
+   public_key_free(payload0, payload3);
 }
-EXPORT_SYMBOL_GPL(public_key_destroy);
 
 /*
  * Verify a signature using a public key.
diff --git a/crypto/asymmetric_keys/x509_cert_parser.c b/crypto/asymmetric_keys/x509_cert_parser.c
index af71878dc15b..430848445dd9 100644
--- a/crypto/asymmetric_keys/x509_cert_parser.c
+++ b/crypto/asymmetric_keys/x509_cert_parser.c
@@ -48,7 +48,7 @@ struct x509_parse_context {
 void x509_free_certificate(struct x509_certificate *cert)
 {
if (cert) {
-   public_key_destroy(cert->pub);
+   public_key_free(cert->pub, NULL);
kfree(cert->issuer);
kfree(cert->subject);
kfree(cert->id);
diff --git a/include/crypto/public_key.h b/include/crypto/public_key.h
index de50d026576d..a3f8f8268e23 100644
--- a/include/crypto/public_key.h
+++ b/include/crypto/public_key.h
@@ -72,8 +72,6 @@ struct public_key {
};
 };
 
-extern void public_key_destroy(void *payload);
-
 /*
  * Public key cryptography signature data
  */
@@ -95,6 +93,9 @@ struct public_key_signature {
};
 };
 
+extern void public_key_free(struct public_key *key,
+   struct public_key_signature *sig);
+
 struct key;
 extern int verify_signature(const struct key *key,
const struct public_key_signature *sig);
diff --git a/include/keys/asymmetric-subtype.h b/include/keys/asymmetric-subtype.h
index 4915d40d3c3c..2480469ce8fb 100644
--- a/include/keys/asymmetric-subtype.h
+++ b/include/keys/asymmetric-subtype.h
@@ -32,7 +32,7 @@ struct asymmetric_key_subtype {
void (*describe)(const struct key *key, struct seq_file *m);
 
/* Destroy a key of this subtype */
-   void (*destroy)(void *payload);
+

[PATCH 00/10] KEYS: Change how keys are determined to be trusted

2015-10-21 Thread David Howells


Here's a set of patches that changes how keys are determined to be trusted
- currently, that's a case of whether a key has KEY_FLAG_TRUSTED set upon
it.  A keyring can then have a flag set (KEY_FLAG_TRUSTED_ONLY) that
indicates that only keys with this flag set may be added to that keyring.

Further, any time an X.509 certificate is instantiated without this flag
set, the certificate is judged against the contents of the system trusted
keyring to determine whether KEY_FLAG_TRUSTED should be set upon it.

With these patches, KEY_FLAG_TRUSTED is removed.  The kernel may add
implicitly trusted keys to a trusted-only keyring by asserting
KEY_ALLOC_TRUSTED when the key is created, but otherwise the key will only
be allowed to be added to the keyring if it can be verified by a key
already in that keyring.  The system trusted keyring is not then special in
this sense and other trusted keyrings can be set up that are wholly
independent of it.

To make this work, we have to retain sufficient data from the X.509
certificate that we can then verify the signature at need.
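
As a rough userspace illustration of the rule described above (not the
kernel implementation), the sketch below models a trusted-only keyring
that accepts a key either because the caller asserts implicit trust (the
role KEY_ALLOC_TRUSTED plays) or because a key already in that keyring
vouches for it.  All the toy_* names are hypothetical, and signature
verification is reduced to matching an issuer name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_KEYS 8

/* Toy stand-in for a key: "issuer" names the key that signed it. */
struct toy_key {
	const char *name;
	const char *issuer;	/* NULL if nothing vouches for this key */
};

struct toy_keyring {
	bool trusted_only;
	size_t nr_keys;
	const struct toy_key *keys[TOY_MAX_KEYS];
};

/* Stand-in for signature verification: is the key's issuer in this ring? */
bool toy_verified_by_keyring(const struct toy_keyring *ring,
			     const struct toy_key *key)
{
	size_t i;

	if (!key->issuer)
		return false;
	for (i = 0; i < ring->nr_keys; i++)
		if (strcmp(ring->keys[i]->name, key->issuer) == 0)
			return true;
	return false;
}

/*
 * Link a key into a keyring.  implicitly_trusted models the kernel
 * asserting trust at key creation.  Returns 0 on success, -1 on refusal.
 */
int toy_key_link(struct toy_keyring *ring, const struct toy_key *key,
		 bool implicitly_trusted)
{
	if (ring->nr_keys >= TOY_MAX_KEYS)
		return -1;
	if (ring->trusted_only && !implicitly_trusted &&
	    !toy_verified_by_keyring(ring, key))
		return -1;
	ring->keys[ring->nr_keys++] = key;
	return 0;
}
```

Note that the decision depends only on the keyring being linked into, so
a key accepted by one trusted-only keyring is not thereby accepted by a
second, independent one.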

The patches can be found here also:


http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=keys-trust

and are tagged with:

keys-trust-20151021

David
---
David Howells (10):
  KEYS: Generalise system_verify_data() to provide access to internal content
  PKCS#7: Make trust determination dependent on contents of trust keyring
  KEYS: Add facility to check key trustworthiness upon link creation
  KEYS: Allow authentication data to be stored in an asymmetric key
  KEYS: Add identifier pointers to public_key_signature struct
  X.509: Retain the key verification data
  X.509: Extract signature digest and make self-signed cert checks earlier
  PKCS#7: Make the signature a pointer rather than embedding it
  X.509: Move the trust validation code out to its own file
  KEYS: Move the point of trust determination to __key_link()


 Documentation/security/keys.txt   |   17 ++
 arch/x86/kernel/kexec-bzimage64.c |   18 --
 certs/system_keyring.c|   49 +++--
 crypto/asymmetric_keys/Kconfig|1 
 crypto/asymmetric_keys/Makefile   |4 
 crypto/asymmetric_keys/asymmetric_keys.h  |2 
 crypto/asymmetric_keys/asymmetric_type.c  |   22 ++
 crypto/asymmetric_keys/mscode_parser.c|   21 +-
 crypto/asymmetric_keys/pkcs7_key_type.c   |   64 +++---
 crypto/asymmetric_keys/pkcs7_parser.c |   59 +++--
 crypto/asymmetric_keys/pkcs7_parser.h |   11 -
 crypto/asymmetric_keys/pkcs7_trust.c  |   44 ++--
 crypto/asymmetric_keys/pkcs7_verify.c |  108 --
 crypto/asymmetric_keys/public_key.c   |   43 
 crypto/asymmetric_keys/public_key.h   |6 +
 crypto/asymmetric_keys/public_key_trust.c |  180 +
 crypto/asymmetric_keys/verify_pefile.c|   40 +---
 crypto/asymmetric_keys/verify_pefile.h|5 
 crypto/asymmetric_keys/x509_cert_parser.c |   53 +++--
 crypto/asymmetric_keys/x509_parser.h  |   12 -
 crypto/asymmetric_keys/x509_public_key.c  |  312 +
 include/crypto/pkcs7.h|6 -
 include/crypto/public_key.h   |   28 +--
 include/keys/asymmetric-subtype.h |6 -
 include/keys/asymmetric-type.h|8 -
 include/keys/system_keyring.h |7 -
 include/linux/key-type.h  |   10 +
 include/linux/key.h   |   12 +
 include/linux/verification.h  |   49 +
 include/linux/verify_pefile.h |   22 --
 kernel/module_signing.c   |5 
 security/integrity/digsig_asymmetric.c|5 
 security/keys/key.c   |   44 +++-
 security/keys/keyring.c   |   18 +-
 34 files changed, 735 insertions(+), 556 deletions(-)
 create mode 100644 crypto/asymmetric_keys/public_key_trust.c
 create mode 100644 include/linux/verification.h
 delete mode 100644 include/linux/verify_pefile.h


[PATCH 07/10] X.509: Extract signature digest and make self-signed cert checks earlier

2015-10-21 Thread David Howells

Extract the signature digest for an X.509 certificate earlier, at the end
of x509_cert_parse() rather than leaving it to the callers thereof.

Further, immediately after that, check the signature on self-signed
certificates, also rather than in the callers of x509_cert_parse().

To do this, we need to determine whether or not the X.509 cert requires
crypto that we don't support before we do the above two steps.

We note in the x509_certificate struct the following bits of information:

 (1) Whether the signature is self-signed (even if we can't check the
 signature due to missing crypto).

 (2) Whether the key held in the certificate needs unsupported crypto to be
 used.  We may get a PKCS#7 message with X.509 certs that we can't make
 use of - we just ignore them and give ENOPKG at the end if we couldn't
 verify anything and at least one of these unusable certs is in the
 chain of trust.

 (3) Whether the signature held in the certificate needs unsupported crypto
 to be checked.  We can still use the key held in this certificate,
 even if we can't check the signature on it - if it is held in the
 system trusted keyring, for instance.  We just can't add it to a ring
 of trusted keys or follow it further up the chain of trust.

Making these checks earlier allows x509_check_signature() to be removed and
replaced with direct calls to public_key_verify_signature().
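
To make the three flags concrete, here is a small userspace sketch of
how a chain walker might act on them.  It is modelled on the description
above rather than copied from the patch: the toy_* names are
hypothetical, and treating an unsupported signature on a non-self-signed
cert as a reason to stop is an assumption drawn from point (3).

```c
#include <stdbool.h>

/* Hypothetical mirror of the three notes kept per certificate. */
struct toy_cert_flags {
	bool self_signed;	/* (1) cert appears to be self-signed */
	bool unsupported_key;	/* (2) key needs crypto we lack */
	bool unsupported_sig;	/* (3) signature needs crypto we lack */
};

enum toy_chain_step {
	TOY_CHAIN_PRUNE,	/* stop here: missing crypto */
	TOY_CHAIN_ROOT,		/* self-signed root of the chain reached */
	TOY_CHAIN_FOLLOW,	/* look up the issuer and verify against it */
};

/* Decide what to do with one certificate while walking up a chain. */
enum toy_chain_step toy_next_step(const struct toy_cert_flags *c)
{
	if (c->unsupported_key)
		return TOY_CHAIN_PRUNE;	/* the key itself is unusable */
	if (c->unsupported_sig)
		return TOY_CHAIN_PRUNE;	/* cannot check who signed it */
	if (c->self_signed)
		return TOY_CHAIN_ROOT;
	return TOY_CHAIN_FOLLOW;
}
```

Because the flags are computed once at parse time, a walker of this
shape needs no separate signature-parameter step before each decision.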

Signed-off-by: David Howells 
---

 crypto/asymmetric_keys/pkcs7_verify.c |   38 ++--
 crypto/asymmetric_keys/x509_cert_parser.c |   10 ++
 crypto/asymmetric_keys/x509_parser.h  |7 +
 crypto/asymmetric_keys/x509_public_key.c  |  139 -
 4 files changed, 121 insertions(+), 73 deletions(-)

diff --git a/crypto/asymmetric_keys/pkcs7_verify.c b/crypto/asymmetric_keys/pkcs7_verify.c
index e225dccdf559..1dede0199673 100644
--- a/crypto/asymmetric_keys/pkcs7_verify.c
+++ b/crypto/asymmetric_keys/pkcs7_verify.c
@@ -190,9 +190,8 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message *pkcs7,
 x509->subject,
 x509->raw_serial_size, x509->raw_serial);
x509->seen = true;
-   ret = x509_get_sig_params(x509);
-   if (ret < 0)
-   goto maybe_missing_crypto_in_x509;
+   if (x509->unsupported_key)
+   goto unsupported_crypto_in_x509;
 
pr_debug("- issuer %s\n", x509->issuer);
sig = x509->sig;
@@ -203,22 +202,14 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message *pkcs7,
pr_debug("- authkeyid.skid %*phN\n",
 sig->auth_ids[1]->len, sig->auth_ids[1]->data);
 
-   if ((!x509->sig->auth_ids[0] && !x509->sig->auth_ids[1]) ||
-   strcmp(x509->subject, x509->issuer) == 0) {
+   if (x509->self_signed) {
/* If there's no authority certificate specified, then
 * the certificate must be self-signed and is the root
 * of the chain.  Likewise if the cert is its own
 * authority.
 */
-   pr_debug("- no auth?\n");
-   if (x509->raw_subject_size != x509->raw_issuer_size ||
-   memcmp(x509->raw_subject, x509->raw_issuer,
-  x509->raw_issuer_size) != 0)
-   return 0;
-
-   ret = x509_check_signature(x509->pub, x509);
-   if (ret < 0)
-   goto maybe_missing_crypto_in_x509;
+   if (x509->unsupported_sig)
+   goto unsupported_crypto_in_x509;
x509->signer = x509;
pr_debug("- self-signed\n");
return 0;
@@ -270,7 +261,7 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message *pkcs7,
sinfo->index);
return 0;
}
-   ret = x509_check_signature(p->pub, x509);
+   ret = public_key_verify_signature(p->pub, p->sig);
if (ret < 0)
return ret;
x509->signer = p;
@@ -282,16 +273,14 @@ static int pkcs7_verify_sig_chain(struct pkcs7_message *pkcs7,
might_sleep();
}
 
-maybe_missing_crypto_in_x509:
+unsupported_crypto_in_x509:
/* Just prune the certificate chain at this point if we lack some
 * crypto module to go further.  Note, however, we don't want to set
-* sinfo->missing_crypto as the signed info block may still be
+* sinfo->unsupp

[PATCH 08/10] PKCS#7: Make the signature a pointer rather than embedding it

2015-10-21 Thread David Howells

Point to the public_key_signature struct from the pkcs7_signed_info struct
rather than embedding it.  This makes it easier to have it take an
arbitrary number of MPIs in future.

We also save a copy of the digest in the signature without sharing the
memory with the crypto layer metadata.  This means we can use
public_key_free() to get rid of the signature record.
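
A minimal userspace sketch of the ownership idea: once the signature
record holds its own copy of the digest, a single free routine that
accepts either pointer as NULL can release it without caring where the
digest was computed.  The toy_* structures and helpers are hypothetical
stand-ins, not the kernel's definitions, and use malloc()/free() in place
of the kernel allocators.

```c
#include <stdlib.h>
#include <string.h>

struct toy_public_key {
	unsigned char *material;
};

struct toy_signature {
	unsigned char *digest;	/* private copy, owned by this record */
	size_t digest_size;
	unsigned char *sig_data;
};

/* Store a private copy of the digest so nothing else shares the memory. */
int toy_signature_set_digest(struct toy_signature *sig,
			     const unsigned char *digest, size_t size)
{
	unsigned char *copy = malloc(size ? size : 1);

	if (!copy)
		return -1;
	memcpy(copy, digest, size);
	free(sig->digest);	/* drop any previous copy */
	sig->digest = copy;
	sig->digest_size = size;
	return 0;
}

/* Free a key and/or a signature; either argument may be NULL. */
void toy_public_key_free(struct toy_public_key *key,
			 struct toy_signature *sig)
{
	if (key) {
		free(key->material);
		free(key);
	}
	if (sig) {
		free(sig->digest);
		free(sig->sig_data);
		free(sig);
	}
}
```

A caller that holds only a signature passes NULL for the key, which is
the calling pattern this patch uses for the signed-info record.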

Signed-off-by: David Howells 
---

 crypto/asymmetric_keys/pkcs7_parser.c |   38 +++-
 crypto/asymmetric_keys/pkcs7_parser.h |   10 +++---
 crypto/asymmetric_keys/pkcs7_trust.c  |4 +--
 crypto/asymmetric_keys/pkcs7_verify.c |   52 +
 4 files changed, 56 insertions(+), 48 deletions(-)

diff --git a/crypto/asymmetric_keys/pkcs7_parser.c b/crypto/asymmetric_keys/pkcs7_parser.c
index 7b69783cff99..8454ae5b5aa8 100644
--- a/crypto/asymmetric_keys/pkcs7_parser.c
+++ b/crypto/asymmetric_keys/pkcs7_parser.c
@@ -44,9 +44,7 @@ struct pkcs7_parse_context {
 static void pkcs7_free_signed_info(struct pkcs7_signed_info *sinfo)
 {
if (sinfo) {
-   mpi_free(sinfo->sig.mpi[0]);
-   kfree(sinfo->sig.digest);
-   kfree(sinfo->signing_cert_id);
+   public_key_free(NULL, sinfo->sig);
kfree(sinfo);
}
 }
@@ -125,6 +123,10 @@ struct pkcs7_message *pkcs7_parse_message(const void *data, size_t datalen)
ctx->sinfo = kzalloc(sizeof(struct pkcs7_signed_info), GFP_KERNEL);
if (!ctx->sinfo)
goto out_no_sinfo;
+   ctx->sinfo->sig = kzalloc(sizeof(struct public_key_signature),
+ GFP_KERNEL);
+   if (!ctx->sinfo->sig)
+   goto out_no_sig;
 
ctx->data = (unsigned long)data;
ctx->ppcerts = &ctx->certs;
@@ -150,6 +152,7 @@ out:
ctx->certs = cert->next;
x509_free_certificate(cert);
}
+out_no_sig:
pkcs7_free_signed_info(ctx->sinfo);
 out_no_sinfo:
pkcs7_free_message(ctx->msg);
@@ -219,25 +222,25 @@ int pkcs7_sig_note_digest_algo(void *context, size_t hdrlen,
 
switch (ctx->last_oid) {
case OID_md4:
-   ctx->sinfo->sig.pkey_hash_algo = HASH_ALGO_MD4;
+   ctx->sinfo->sig->pkey_hash_algo = HASH_ALGO_MD4;
break;
case OID_md5:
-   ctx->sinfo->sig.pkey_hash_algo = HASH_ALGO_MD5;
+   ctx->sinfo->sig->pkey_hash_algo = HASH_ALGO_MD5;
break;
case OID_sha1:
-   ctx->sinfo->sig.pkey_hash_algo = HASH_ALGO_SHA1;
+   ctx->sinfo->sig->pkey_hash_algo = HASH_ALGO_SHA1;
break;
case OID_sha256:
-   ctx->sinfo->sig.pkey_hash_algo = HASH_ALGO_SHA256;
+   ctx->sinfo->sig->pkey_hash_algo = HASH_ALGO_SHA256;
break;
case OID_sha384:
-   ctx->sinfo->sig.pkey_hash_algo = HASH_ALGO_SHA384;
+   ctx->sinfo->sig->pkey_hash_algo = HASH_ALGO_SHA384;
break;
case OID_sha512:
-   ctx->sinfo->sig.pkey_hash_algo = HASH_ALGO_SHA512;
+   ctx->sinfo->sig->pkey_hash_algo = HASH_ALGO_SHA512;
break;
case OID_sha224:
-   ctx->sinfo->sig.pkey_hash_algo = HASH_ALGO_SHA224;
+   ctx->sinfo->sig->pkey_hash_algo = HASH_ALGO_SHA224;
default:
printk("Unsupported digest algo: %u\n", ctx->last_oid);
return -ENOPKG;
@@ -256,7 +259,7 @@ int pkcs7_sig_note_pkey_algo(void *context, size_t hdrlen,
 
switch (ctx->last_oid) {
case OID_rsaEncryption:
-   ctx->sinfo->sig.pkey_algo = PKEY_ALGO_RSA;
+   ctx->sinfo->sig->pkey_algo = PKEY_ALGO_RSA;
break;
default:
printk("Unsupported pkey algo: %u\n", ctx->last_oid);
@@ -617,16 +620,17 @@ int pkcs7_sig_note_signature(void *context, size_t hdrlen,
 const void *value, size_t vlen)
 {
struct pkcs7_parse_context *ctx = context;
+   struct public_key_signature *sig = ctx->sinfo->sig;
MPI mpi;
 
-   BUG_ON(ctx->sinfo->sig.pkey_algo != PKEY_ALGO_RSA);
+   BUG_ON(sig->pkey_algo != PKEY_ALGO_RSA);
 
mpi = mpi_read_raw_data(value, vlen);
if (!mpi)
return -ENOMEM;
 
-   ctx->sinfo->sig.mpi[0] = mpi;
-   ctx->sinfo->sig.nr_mpi = 1;
+   sig->mpi[0] = mpi;
+   sig->nr_mpi = 1;
return 0;
 }
 
@@ -662,12 +666,16 @@ int pkcs7_note_signed_info(void *context, size_t hdrlen,
 
pr_devel("SINFO KID: %u [%*phN]\n", kid->len, kid->len, kid->data);
 
-   sinfo->signing_cert_id = kid;
+   sinfo->sig->auth_i

[PATCH 09/10] X.509: Move the trust validation code out to its own file

2015-10-21 Thread David Howells

Move the X.509 trust validation code out to its own file so that it can be
generalised.

Signed-off-by: David Howells 
---

 crypto/asymmetric_keys/Makefile   |2 
 crypto/asymmetric_keys/public_key_trust.c |  192 +
 crypto/asymmetric_keys/x509_parser.h  |6 +
 crypto/asymmetric_keys/x509_public_key.c  |  167 -
 4 files changed, 199 insertions(+), 168 deletions(-)
 create mode 100644 crypto/asymmetric_keys/public_key_trust.c

diff --git a/crypto/asymmetric_keys/Makefile b/crypto/asymmetric_keys/Makefile
index cd1406f9b14a..bd07987c64e7 100644
--- a/crypto/asymmetric_keys/Makefile
+++ b/crypto/asymmetric_keys/Makefile
@@ -12,7 +12,7 @@ obj-$(CONFIG_PUBLIC_KEY_ALGO_RSA) += rsa.o
 #
 # X.509 Certificate handling
 #
-obj-$(CONFIG_X509_CERTIFICATE_PARSER) += x509_key_parser.o
+obj-$(CONFIG_X509_CERTIFICATE_PARSER) += x509_key_parser.o public_key_trust.o
 x509_key_parser-y := \
x509-asn1.o \
x509_akid-asn1.o \
diff --git a/crypto/asymmetric_keys/public_key_trust.c b/crypto/asymmetric_keys/public_key_trust.c
new file mode 100644
index ..753a413d479b
--- /dev/null
+++ b/crypto/asymmetric_keys/public_key_trust.c
@@ -0,0 +1,192 @@
+/* Instantiate a public key crypto key from an X.509 Certificate
+ *
+ * Copyright (C) 2012 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowe...@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "X.509: "fmt
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "asymmetric_keys.h"
+#include "public_key.h"
+#include "x509_parser.h"
+
+static bool use_builtin_keys;
+static struct asymmetric_key_id *ca_keyid;
+
+#ifndef MODULE
+static struct {
+   struct asymmetric_key_id id;
+   unsigned char data[10];
+} cakey;
+
+static int __init ca_keys_setup(char *str)
+{
+   if (!str)   /* default system keyring */
+   return 1;
+
+   if (strncmp(str, "id:", 3) == 0) {
+   struct asymmetric_key_id *p = &cakey.id;
+   size_t hexlen = (strlen(str) - 3) / 2;
+   int ret;
+
+   if (hexlen == 0 || hexlen > sizeof(cakey.data)) {
+   pr_err("Missing or invalid ca_keys id\n");
+   return 1;
+   }
+
+   ret = __asymmetric_key_hex_to_key_id(str + 3, p, hexlen);
+   if (ret < 0)
+   pr_err("Unparsable ca_keys id hex string\n");
+   else
+   ca_keyid = p;   /* owner key 'id:xx' */
+   } else if (strcmp(str, "builtin") == 0) {
+   use_builtin_keys = true;
+   }
+
+   return 1;
+}
+__setup("ca_keys=", ca_keys_setup);
+#endif
+
+/**
+ * x509_request_asymmetric_key - Request a key by X.509 certificate params.
+ * @keyring: The keys to search.
+ * @id: The issuer & serialNumber to look for or NULL.
+ * @skid: The subjectKeyIdentifier to look for or NULL.
+ * @partial: Use partial match if true, exact if false.
+ *
+ * Find a key in the given keyring by identifier.  The preferred identifier is
+ * the issuer + serialNumber and the fallback identifier is the
+ * subjectKeyIdentifier.  If both are given, the lookup is by the former, but
+ * the latter must also match.
+ */
+struct key *x509_request_asymmetric_key(struct key *keyring,
+   const struct asymmetric_key_id *id,
+   const struct asymmetric_key_id *skid,
+   bool partial)
+{
+   struct key *key;
+   key_ref_t ref;
+   const char *lookup;
+   char *req, *p;
+   int len;
+
+   if (id) {
+   lookup = id->data;
+   len = id->len;
+   } else {
+   lookup = skid->data;
+   len = skid->len;
+   }
+
+   /* Construct an identifier "id:". */
+   p = req = kmalloc(2 + 1 + len * 2 + 1, GFP_KERNEL);
+   if (!req)
+   return ERR_PTR(-ENOMEM);
+
+   if (partial) {
+   *p++ = 'i';
+   *p++ = 'd';
+   } else {
+   *p++ = 'e';
+   *p++ = 'x';
+   }
+   *p++ = ':';
+   p = bin2hex(p, lookup, len);
+   *p = 0;
+
+   pr_debug("Look up: \"%s\"\n", req);
+
+   ref = keyring_search(make_key_ref(keyring, 1),
+&key_type_asymmetric, req);
+   if (IS_ERR(ref))
+   pr_debug("R

[PATCH 10/10] KEYS: Move the point of trust determination to __key_link()

2015-10-21 Thread David Howells

Move the point at which a key is determined to be trustworthy to
__key_link() so that we use the contents of the keyring being linked into
to determine whether the key being linked in is trusted or not.

What is 'trusted' then becomes a matter of what's in the keyring.

Currently, the test is done when the key is parsed, but given that at that
point we can only sensibly refer to the contents of the system trusted
keyring, we can only use that as the basis for working out the
trustworthiness of a new key.

With this change, a trusted keyring is a set of keys that once the
trusted-only flag is set cannot be added to except by verification through
one of the contained keys.

Further, adding a key into a trusted keyring, whilst it might grant
trustworthiness in the context of that keyring, does not automatically
grant trustworthiness in the context of a second keyring to which it could
be secondarily linked.

To accomplish this, the authentication data associated with the key source
must now be retained.  For an X.509 cert, this means the contents of the
AuthorityKeyIdentifier and the signature data.

Signed-off-by: David Howells 
---

 certs/system_keyring.c|3 +
 crypto/asymmetric_keys/Makefile   |2 -
 crypto/asymmetric_keys/asymmetric_keys.h  |2 +
 crypto/asymmetric_keys/asymmetric_type.c  |   15 +
 crypto/asymmetric_keys/pkcs7_trust.c  |   22 +++
 crypto/asymmetric_keys/public_key.c   |   19 ++
 crypto/asymmetric_keys/public_key.h   |6 ++
 crypto/asymmetric_keys/public_key_trust.c |   94 +
 crypto/asymmetric_keys/x509_parser.h  |6 --
 crypto/asymmetric_keys/x509_public_key.c  |6 --
 include/crypto/public_key.h   |8 +-
 include/keys/asymmetric-subtype.h |4 +
 security/integrity/digsig_asymmetric.c|5 +-
 13 files changed, 108 insertions(+), 84 deletions(-)

diff --git a/certs/system_keyring.c b/certs/system_keyring.c
index e7f286413276..fbaaaea59f02 100644
--- a/certs/system_keyring.c
+++ b/certs/system_keyring.c
@@ -35,7 +35,8 @@ static __init int system_trusted_keyring_init(void)
keyring_alloc(".system_keyring",
  KUIDT_INIT(0), KGIDT_INIT(0), current_cred(),
  ((KEY_POS_ALL & ~KEY_POS_SETATTR) |
- KEY_USR_VIEW | KEY_USR_READ | KEY_USR_SEARCH),
+  KEY_USR_VIEW | KEY_USR_READ | KEY_USR_SEARCH |
+  KEY_USR_WRITE),
  KEY_ALLOC_NOT_IN_QUOTA, NULL);
if (IS_ERR(system_trusted_keyring))
panic("Can't allocate system trusted keyring\n");
diff --git a/crypto/asymmetric_keys/Makefile b/crypto/asymmetric_keys/Makefile
index bd07987c64e7..69bcdc9a2ce6 100644
--- a/crypto/asymmetric_keys/Makefile
+++ b/crypto/asymmetric_keys/Makefile
@@ -6,7 +6,7 @@ obj-$(CONFIG_ASYMMETRIC_KEY_TYPE) += asymmetric_keys.o
 
 asymmetric_keys-y := asymmetric_type.o signature.o
 
-obj-$(CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE) += public_key.o
+obj-$(CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE) += public_key.o public_key_trust.o
 obj-$(CONFIG_PUBLIC_KEY_ALGO_RSA) += rsa.o
 
 #
diff --git a/crypto/asymmetric_keys/asymmetric_keys.h b/crypto/asymmetric_keys/asymmetric_keys.h
index 1d450b580245..ca8e9ac34ce6 100644
--- a/crypto/asymmetric_keys/asymmetric_keys.h
+++ b/crypto/asymmetric_keys/asymmetric_keys.h
@@ -9,6 +9,8 @@
  * 2 of the Licence, or (at your option) any later version.
  */
 
+#include 
+
 extern struct asymmetric_key_id *asymmetric_key_hex_to_key_id(const char *id);
 
 extern int __asymmetric_key_hex_to_key_id(const char *id,
diff --git a/crypto/asymmetric_keys/asymmetric_type.c b/crypto/asymmetric_keys/asymmetric_type.c
index a79d30128821..e02cbd068151 100644
--- a/crypto/asymmetric_keys/asymmetric_type.c
+++ b/crypto/asymmetric_keys/asymmetric_type.c
@@ -362,10 +362,25 @@ static void asymmetric_key_destroy(struct key *key)
asymmetric_key_free_kids(kids);
 }
 
+/*
+ * Verify the trust on an asymmetric key when added to a trusted-only keyring.
+ * The keyring provides a list of keys to check against.
+ */
+static int asymmetric_key_verify_trust(const union key_payload *payload,
+  struct key *keyring)
+{
+   struct asymmetric_key_subtype *subtype = payload->data[asym_subtype];
+
+   pr_devel("==>%s()\n", __func__);
+
+   return subtype->verify_trust(payload, keyring);
+}
+
 struct key_type key_type_asymmetric = {
.name   = "asymmetric",
.preparse   = asymmetric_key_preparse,
.free_preparse  = asymmetric_key_free_preparse,
+   .verify_trust   = asymmetric_key_verify_trust,
.instantiate= generic_key_instantiate,
.match_preparse = asymmetric_key_match_preparse,
.match_free = asymmetric_ke

Re: [PATCH 1/6] KEYS: use kvfree() in add_key

2015-10-21 Thread David Howells

These patches can be found here also:


http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=keys-next

And tagged with:

keys-next-20151021

David

[PATCH 5/6] KEYS: Provide a script to extract a module signature

2015-10-21 Thread David Howells

The supplied script takes a signed module file and extracts the tailmost
signature (there could theoretically be more than one) and dumps all or
part of it or the unsigned file to stdout.

Call as:

scripts/extract-module-sig.pl -[0adnks] module-file >out

where the initial flag indicates which bit of the signed file you want dumping
to stdout:

 (*) "-0".  Dumps the unsigned data with the signature stripped.

 (*) "-a".  Dumps all of the signature data, including the magic number.

 (*) "-d".  Dumps the signature information block as a sequence of decimal
numbers in text form with spaces between (crypto algorithm type,
hash type, identifier type, signer's name length, key identifier
length and signature length).

 (*) "-n".  Dumps the signer's name contents.

 (*) "-k".  Dumps the key identifier contents.

 (*) "-s".  Dumps the cryptographic signature contents.

In the case that the signature is a PKCS#7 (or CMS) message, -n and -k will
print a warning to stderr and dump nothing to stdout, but will otherwise
complete okay; the entire PKCS#7/CMS message will be dumped by "-s"; and "-d"
will show "0 0 2 0 0 ".
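
For readers who prefer C to Perl, the trailer layout that the script
walks can be sketched as below.  The layout (magic string at the very
end, preceded by a 12-byte information block of five single-byte fields,
three padding bytes and a big-endian 32-bit signature length, preceded in
turn by the signature, key identifier and signer's name) is taken from
the script in this patch; the modsig_* names are hypothetical and this is
an illustration, not the kernel's parser.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODSIG_MAGIC	"~Module signature appended~\n"
#define MODSIG_INFO_LEN	12

struct modsig_info {
	uint8_t algo, hash, id_type, name_len, kid_len;
	uint32_t sig_len;
	size_t module_len;	/* length of the unsigned module data */
};

/* Parse the trailer of a signed module image; returns 0 or -1. */
int modsig_parse(const uint8_t *buf, size_t len, struct modsig_info *out)
{
	const size_t magic_len = sizeof(MODSIG_MAGIC) - 1;
	const uint8_t *info;
	size_t p, trailer;

	if (len < MODSIG_INFO_LEN + magic_len)
		return -1;

	/* The magic string is the last thing in the file. */
	p = len - magic_len;
	if (memcmp(buf + p, MODSIG_MAGIC, magic_len) != 0)
		return -1;

	/* The information block sits immediately before the magic. */
	p -= MODSIG_INFO_LEN;
	info = buf + p;
	out->algo = info[0];
	out->hash = info[1];
	out->id_type = info[2];
	out->name_len = info[3];
	out->kid_len = info[4];
	/* info[5..7] are padding; the length is big-endian. */
	out->sig_len = ((uint32_t)info[8] << 24) | ((uint32_t)info[9] << 16) |
		       ((uint32_t)info[10] << 8) | (uint32_t)info[11];

	/* Name, key identifier and signature precede the block. */
	trailer = (size_t)out->name_len + out->kid_len + out->sig_len;
	if (p < trailer)
		return -1;
	out->module_len = p - trailer;
	return 0;
}
```

With a PKCS#7/CMS signature the first, second, fourth and fifth fields
come back as zero and id_type is 2, matching the "-d" output described
above.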

Signed-off-by: David Howells 
---

 scripts/extract-module-sig.pl |  136 +
 1 file changed, 136 insertions(+)
 create mode 100755 scripts/extract-module-sig.pl

diff --git a/scripts/extract-module-sig.pl b/scripts/extract-module-sig.pl
new file mode 100755
index ..faac6f2e377f
--- /dev/null
+++ b/scripts/extract-module-sig.pl
@@ -0,0 +1,136 @@
+#!/usr/bin/perl -w
+#
+# extract-mod-sig <part> <module-file>
+#
+# Reads the module file and writes out some or all of the signature
+# section to stdout.  Part is the bit to be written and is one of:
+#
+#  -0: The unsigned module, no signature data at all
+#  -a: All of the signature data, including magic number
+#  -d: Just the descriptor values as a sequence of numbers
+#  -n: Just the signer's name
+#  -k: Just the key ID
+#  -s: Just the crypto signature or PKCS#7 message
+#
+use strict;
+
+die "Format: $0 -[0adnks] module-file >out\n"
+if ($#ARGV != 1);
+
+my $part = $ARGV[0];
+my $modfile = $ARGV[1];
+
+my $magic_number = "~Module signature appended~\n";
+
+#
+# Read the module contents
+#
+open FD, "<$modfile" || die $modfile;
+binmode(FD);
+my @st = stat(FD);
+die "$modfile" unless (@st);
+my $buf = "";
+my $len = sysread(FD, $buf, $st[7]);
+die "$modfile" unless (defined($len));
+die "Short read on $modfile\n" unless ($len == $st[7]);
+close(FD) || die $modfile;
+
+print STDERR "Read ", $len, " bytes from module file\n";
+
+die "The file is too short to have a sig magic number and descriptor\n"
+if ($len < 12 + length($magic_number));
+
+#
+# Check for the magic number and extract the information block
+#
+my $p = $len - length($magic_number);
+my $raw_magic = substr($buf, $p);
+
+die "Magic number not found at $len\n"
+if ($raw_magic ne $magic_number);
+print STDERR "Found magic number at $len\n";
+
+$p -= 12;
+my $raw_info = substr($buf, $p, 12);
+
+my @info = unpack("CCCCCxxxN", $raw_info);
+my ($algo, $hash, $id_type, $name_len, $kid_len, $sig_len) = @info;
+
+if ($id_type == 0) {
+print STDERR "Found PGP key identifier\n";
+} elsif ($id_type == 1) {
+print STDERR "Found X.509 cert identifier\n";
+} elsif ($id_type == 2) {
+print STDERR "Found PKCS#7/CMS encapsulation\n";
+} else {
+print STDERR "Found unsupported identifier type $id_type\n";
+}
+
+#
+# Extract the three pieces of info data
+#
+die "Insufficient name+kid+sig data in file\n"
+unless ($p >= $name_len + $kid_len + $sig_len);
+
+$p -= $sig_len;
+my $raw_sig = substr($buf, $p, $sig_len);
+$p -= $kid_len;
+my $raw_kid = substr($buf, $p, $kid_len);
+$p -= $name_len;
+my $raw_name = substr($buf, $p, $name_len);
+
+my $module_len = $p;
+
+if ($sig_len > 0) {
+print STDERR "Found $sig_len bytes of signature [";
+my $n = $sig_len > 16 ? 16 : $sig_len;
+foreach my $i (unpack("C" x $n, substr($raw_sig, 0, $n))) {
+   printf STDERR "%02x", $i;
+}
+print STDERR "]\n";
+}
+
+if ($kid_len > 0) {
+print STDERR "Found $kid_len bytes of key identifier [";
+my $n = $kid_len > 16 ? 16 : $kid_len;
+foreach my $i (unpack("C" x $n, substr($raw_kid, 0, $n))) {
+   printf STDERR "%02x", $i;
+}
+print STDERR "]\n";
+}
+
+if ($name_len > 0) {
+print STDERR "Found $name_len bytes of signer's name [$raw_name]\n";
+}
+
+#
+# Produce the requested output
+#
+if ($part eq "-0") {
+# The unsigned module, no signature data at all
+

[PATCH 3/6] keys: Be more consistent in selection of union members used

2015-10-21 Thread David Howells

From: Insu Yun 

key->description and key->index_key.description are the same because
they are members of a union.  For readability, though, it seems better
to use the same name for both the duplication and the validation.

Signed-off-by: Insu Yun 
Signed-off-by: David Howells 
---

 security/keys/key.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/keys/key.c b/security/keys/key.c
index aee2ec5a18fc..c0478465d1ac 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -278,7 +278,7 @@ struct key *key_alloc(struct key_type *type, const char *desc,
 
key->index_key.desc_len = desclen;
key->index_key.description = kmemdup(desc, desclen + 1, GFP_KERNEL);
-   if (!key->description)
+   if (!key->index_key.description)
goto no_memory_3;
 
atomic_set(&key->usage, 1);


[PATCH 4/6] KEYS: Provide a script to extract the sys cert list from a vmlinux file

2015-10-21 Thread David Howells

The supplied script takes a vmlinux file - and if necessary a System.map
file - locates the system certificates list and extracts it to the named
file.

Call as:

./scripts/extract-sys-certs vmlinux certs

if vmlinux contains symbols and:

./scripts/extract-sys-certs -s System.map vmlinux certs

if it does not.

It prints something like the following to stdout:

Have 27 sections
No symbols in vmlinux, trying System.map
Have 80088 symbols
Have 1346 bytes of certs at VMA 0x8201c540
Certificate list in section .init.data
Certificate list at file offset 0x141c540

If vmlinux contains symbols then that is used rather than System.map - even
if one is given.
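
The address arithmetic the script performs can be sketched in C as
follows: find the one section whose virtual address range contains the
certificate list, then translate the start address into a file offset.
The toy_section structure and the function name are hypothetical, and
the section values used to exercise it would have to come from something
like "objdump -h".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one row of the section table. */
struct toy_section {
	const char *name;
	uint64_t vma;	/* virtual address of the section start */
	uint64_t len;	/* section size in bytes */
	uint64_t foff;	/* file offset of the section start */
};

/*
 * Translate the [start, end) virtual address range of the certificate
 * list into a file offset.  Returns 0 on success, or -1 if the range is
 * not wholly inside exactly one section.
 */
int cert_list_file_offset(const struct toy_section *secs, size_t nr,
			  uint64_t start, uint64_t end, uint64_t *foff)
{
	const struct toy_section *found = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		uint64_t vend = secs[i].vma + secs[i].len;

		if (start < secs[i].vma || start >= vend)
			continue;
		if (end > vend)
			return -1;	/* overflows the section */
		if (found)
			return -1;	/* matches more than one section */
		found = &secs[i];
	}
	if (!found)
		return -1;

	*foff = start - found->vma + found->foff;
	return 0;
}
```

The offset returned is where a reader would seek to in vmlinux before
reading end - start bytes of certificate data.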

Signed-off-by: David Howells 
---

 scripts/extract-sys-certs.pl |  144 ++
 1 file changed, 144 insertions(+)
 create mode 100755 scripts/extract-sys-certs.pl

diff --git a/scripts/extract-sys-certs.pl b/scripts/extract-sys-certs.pl
new file mode 100755
index ..d476e7d1fd88
--- /dev/null
+++ b/scripts/extract-sys-certs.pl
@@ -0,0 +1,144 @@
+#!/usr/bin/perl -w
+#
+use strict;
+use Math::BigInt;
+use Fcntl "SEEK_SET";
+
+die "Format: $0 [-s <systemmap-file>] <vmlinux-file> <keyring-file>\n"
+if ($#ARGV != 1 && $#ARGV != 3 ||
+   $#ARGV == 3 && $ARGV[0] ne "-s");
+
+my $sysmap = "";
+if ($#ARGV == 3) {
+shift;
+$sysmap = $ARGV[0];
+shift;
+}
+
+my $vmlinux = $ARGV[0];
+my $keyring = $ARGV[1];
+
+#
+# Parse the vmlinux section table
+#
+open FD, "objdump -h $vmlinux |" || die $vmlinux;
+my @lines = <FD>;
+close(FD) || die $vmlinux;
+
+my @sections = ();
+
+foreach my $line (@lines) {
+chomp($line);
+if ($line =~ /\s*([0-9]+)\s+(\S+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+2[*][*]([0-9]+)/
+   ) {
+   my $seg  = $1;
+   my $name = $2;
+   my $len  = Math::BigInt->new("0x" . $3);
+   my $vma  = Math::BigInt->new("0x" . $4);
+   my $lma  = Math::BigInt->new("0x" . $5);
+   my $foff = Math::BigInt->new("0x" . $6);
+   my $align = 2 ** $7;
+
+   push @sections, { name => $name,
+ vma => $vma,
+ len => $len,
+ foff => $foff };
+}
+}
+
+print "Have $#sections sections\n";
+
+#
+# Try and parse the vmlinux symbol table.  If the vmlinux file has been created
+# from a vmlinuz file with extract-vmlinux then the symbol table will be empty.
+#
+open FD, "nm $vmlinux 2>/dev/null |" || die $vmlinux;
+@lines = <FD>;
+close(FD) || die $vmlinux;
+
+my %symbols = ();
+my $nr_symbols = 0;
+
+sub parse_symbols(@) {
+foreach my $line (@_) {
+   chomp($line);
+   if ($line =~ /([0-9a-f]+)\s([a-zA-Z])\s(\S+)/
+   ) {
+   my $addr = "0x" . $1;
+   my $type = $2;
+   my $name = $3;
+
+   $symbols{$name} = $addr;
+   $nr_symbols++;
+   }
+}
+}
+parse_symbols(@lines);
+
+if ($nr_symbols == 0 && $sysmap ne "") {
+print "No symbols in vmlinux, trying $sysmap\n";
+
+open FD, "<$sysmap" || die $sysmap;
+@lines = <FD>;
+close(FD) || die $sysmap;
+parse_symbols(@lines);
+}
+
+die "No symbols available\n"
+if ($nr_symbols == 0);
+
+print "Have $nr_symbols symbols\n";
+
+die "Can't find system certificate list"
+unless (exists($symbols{"__cert_list_start"}) &&
+   exists($symbols{"__cert_list_end"}));
+
+my $start = Math::BigInt->new($symbols{"__cert_list_start"});
+my $end = Math::BigInt->new($symbols{"__cert_list_end"});
+my $size = $end - $start;
+
+printf "Have %u bytes of certs at VMA 0x%x\n", $size, $start;
+
+my $s = undef;
+foreach my $sec (@sections) {
+my $s_name = $sec->{name};
+my $s_vma = $sec->{vma};
+my $s_len = $sec->{len};
+my $s_foff = $sec->{foff};
+my $s_vend = $s_vma + $s_len;
+
+next unless ($start >= $s_vma);
+next if ($start >= $s_vend);
+
+die "Cert object partially overflows section $s_name\n"
+   if ($end > $s_vend);
+
+die "Cert object in multiple sections: ", $s_name, " and ", $s->{name}, "\n"
+   if ($s);
+$s = $sec;
+}
+
+die "Cert object not inside a section\n"
+unless ($s);
+
+print "Certificate list in section ", $s->{name}, "\n";
+
+my $foff = $start - $s->{vma} + $s->{foff};
+
+printf "Certificate list at file offset 0x%x\n", $foff;
+
+open FD, "<$vmlinux" or die $vmlinux;
+binmode(FD);
+die $vmlinux if (!defined(sysseek(FD, $foff, SEEK_SET)));
+my $buf = "";
+my $len = sysread(FD, $buf, $size);
+die "$vmlinux" if (!defined($len));
+die &q

[PATCH 2/6] certs: add .gitignore to stop git nagging about x509_certificate_list

2015-10-21 Thread David Howells

From: Paul Gortmaker 

Currently we see this in "git status" if we build in the source dir:

Untracked files:
  (use "git add <file>..." to include in what will be committed)

certs/x509_certificate_list

It looks like it used to live in kernel/ so we squash that .gitignore
entry at the same time.  I didn't bother to dig through git history to
see when it moved, since it is just a minor annoyance at most.

Cc: David Woodhouse 
Cc: keyri...@linux-nfs.org
Signed-off-by: Paul Gortmaker 
Signed-off-by: David Howells 
---

 certs/.gitignore  |4 
 kernel/.gitignore |1 -
 2 files changed, 4 insertions(+), 1 deletion(-)
 create mode 100644 certs/.gitignore

diff --git a/certs/.gitignore b/certs/.gitignore
new file mode 100644
index 000000000000..f51aea4a71ec
--- /dev/null
+++ b/certs/.gitignore
@@ -0,0 +1,4 @@
+#
+# Generated files
+#
+x509_certificate_list
diff --git a/kernel/.gitignore b/kernel/.gitignore
index 790d83c7d160..b3097bde4e9c 100644
--- a/kernel/.gitignore
+++ b/kernel/.gitignore
@@ -5,4 +5,3 @@ config_data.h
 config_data.gz
 timeconst.h
 hz.bc
-x509_certificate_list


[PATCH 6/6] KEYS: Merge the type-specific data with the payload data

2015-10-21 Thread David Howells

Merge the type-specific data with the payload data into one four-word chunk
as it seems pointless to keep them separate.

Use user_key_payload() for accessing the payloads of overloaded
user-defined keys.

Signed-off-by: David Howells 
cc: linux-c...@vger.kernel.org
cc: ecryp...@vger.kernel.org
cc: linux-e...@vger.kernel.org
cc: linux-f2fs-de...@lists.sourceforge.net
cc: linux-...@vger.kernel.org
cc: ceph-de...@vger.kernel.org
cc: linux-ima-de...@lists.sourceforge.net
---

 Documentation/crypto/asymmetric-keys.txt |   27 +++--
 Documentation/security/keys.txt  |   41 ---
 crypto/asymmetric_keys/asymmetric_keys.h |5 --
 crypto/asymmetric_keys/asymmetric_type.c |   44 -
 crypto/asymmetric_keys/public_key.c  |4 +-
 crypto/asymmetric_keys/signature.c   |2 -
 crypto/asymmetric_keys/x509_parser.h |1 
 crypto/asymmetric_keys/x509_public_key.c |9 ++--
 fs/cifs/cifs_spnego.c|6 +--
 fs/cifs/cifsacl.c|   25 ++--
 fs/cifs/connect.c|9 ++--
 fs/cifs/sess.c   |2 -
 fs/cifs/smb2pdu.c|2 -
 fs/ecryptfs/ecryptfs_kernel.h|5 +-
 fs/ext4/crypto_key.c |4 +-
 fs/f2fs/crypto_key.c |4 +-
 fs/fscache/object-list.c |4 +-
 fs/nfs/nfs4idmap.c   |4 +-
 include/crypto/public_key.h  |1 
 include/keys/asymmetric-subtype.h|2 -
 include/keys/asymmetric-type.h   |   15 +++
 include/keys/user-type.h |8 
 include/linux/key-type.h |3 -
 include/linux/key.h  |   33 +++
 kernel/module_signing.c  |1 
 lib/digsig.c |7 ++-
 net/ceph/ceph_common.c   |2 -
 net/ceph/crypto.c|6 +--
 net/dns_resolver/dns_key.c   |   20 +
 net/dns_resolver/dns_query.c |7 +--
 net/dns_resolver/internal.h  |8 
 net/rxrpc/af_rxrpc.c |2 -
 net/rxrpc/ar-key.c   |   32 +++
 net/rxrpc/ar-output.c|2 -
 net/rxrpc/ar-security.c  |4 +-
 net/rxrpc/rxkad.c|   16 ---
 security/integrity/evm/evm_crypto.c  |2 -
 security/keys/big_key.c  |   47 +++---
 security/keys/encrypted-keys/encrypted.c |   18 
 security/keys/encrypted-keys/encrypted.h |4 +-
 security/keys/encrypted-keys/masterkey_trusted.c |4 +-
 security/keys/key.c  |   18 
 security/keys/keyctl.c   |4 +-
 security/keys/keyring.c  |   12 +++---
 security/keys/process_keys.c |4 +-
 security/keys/request_key.c  |4 +-
 security/keys/request_key_auth.c |   12 +++---
 security/keys/trusted.c  |6 +--
 security/keys/user_defined.c |   14 +++
 49 files changed, 286 insertions(+), 230 deletions(-)

diff --git a/Documentation/crypto/asymmetric-keys.txt b/Documentation/crypto/asymmetric-keys.txt
index b7675904a747..8c07e0ea6bc0 100644
--- a/Documentation/crypto/asymmetric-keys.txt
+++ b/Documentation/crypto/asymmetric-keys.txt
@@ -186,7 +186,7 @@ and looks like the following:
const struct public_key_signature *sig);
};
 
-Asymmetric keys point to this with their type_data[0] member.
+Asymmetric keys point to this with their payload[asym_subtype] member.
 
 The owner and name fields should be set to the owning module and the name of
 the subtype.  Currently, the name is only used for print statements.
@@ -269,8 +269,7 @@ mandatory:
 
struct key_preparsed_payload {
char*description;
-   void*type_data[2];
-   void*payload;
+   void*payload[4];
const void  *data;
size_t  datalen;
size_t  quotalen;
@@ -283,16 +282,18 @@ mandatory:
  not theirs.
 
  If the parser is happy with the blob, it should propose a description for
- the key and attach it to ->description, ->type_data[0] should be set to
- point to the subtype to be used, ->payload should be set to point to the
- initialised data for that subtype, ->

[PATCH 1/6] KEYS: use kvfree() in add_key

2015-10-21 Thread David Howells

From: Geliang Tang 

There is no need to make a flag to tell that this memory is allocated by
kmalloc or vmalloc. Just use kvfree to free the memory.

Signed-off-by: Geliang Tang 
Signed-off-by: David Howells 
---

 security/keys/keyctl.c |8 +---
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 0b9ec78a7a7a..6110fa498494 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -67,7 +67,6 @@ SYSCALL_DEFINE5(add_key, const char __user *, _type,
char type[32], *description;
void *payload;
long ret;
-   bool vm;
 
ret = -EINVAL;
if (plen > 1024 * 1024 - 1)
@@ -98,14 +97,12 @@ SYSCALL_DEFINE5(add_key, const char __user *, _type,
/* pull the payload in if one was supplied */
payload = NULL;
 
-   vm = false;
if (_payload) {
ret = -ENOMEM;
payload = kmalloc(plen, GFP_KERNEL | __GFP_NOWARN);
if (!payload) {
if (plen <= PAGE_SIZE)
goto error2;
-   vm = true;
payload = vmalloc(plen);
if (!payload)
goto error2;
@@ -138,10 +135,7 @@ SYSCALL_DEFINE5(add_key, const char __user *, _type,
 
key_ref_put(keyring_ref);
  error3:
-   if (!vm)
-   kfree(payload);
-   else
-   vfree(payload);
+   kvfree(payload);
  error2:
kfree(description);
  error:


Re: [PATCH v4 2/3] Create IMA machine owner keys (MOK) and blacklist keyrings;

2015-10-21 Thread David Howells

Petko Manolov  wrote:

> > > As far as i know there is no concept of write-once to a keyring in the
> > > kernel.  David will correct me if i am wrong.  I wonder how hard would
> > > it be to add such functionality, in case it is missing?
> > 
> > Not hard, particularly if it's only an attribute that the kernel can set.
> 
> Definitely kernel-only.  The other way does not appeal to me in terms of 
> security.

Nor me in terms of letting userspace lock keys into the kernel arbitrarily.

David

Re: [PATCH v4 2/3] Create IMA machine owner keys (MOK) and blacklist keyrings;

2015-10-21 Thread David Howells

Mimi Zohar  wrote:

> > I need to think about this.  Should -EKEYREVOKED be the same as -ENOKEY in
> > this case?  I guess the end result is pretty much the same from IMA view
> > point, but there may be a requirement to list all revoked keys...
> 
> When checking the blacklist, getting -EKEYREVOKED is definitely
> different than -ENOKEY.

Actually, I misspoke earlier.  Revoked keys are only skipped by the search if
a live key is found.  Should all the keys in the blacklist just be revoked so
that the search of the list returns either -ENOKEY (no key there) or
-EKEYREVOKED (the key is blacklisted)?  That might be getting too
over-complicated though.
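
To make the two outcomes concrete, here is a rough userspace sketch (not kernel
code) of a search that returns success if a live key matches, -EKEYREVOKED if
only revoked keys match, and -ENOKEY otherwise.  All the toy_ names are made up
for illustration, and the error numbers are assumed to be the usual Linux
values (ENOKEY 126, EKEYREVOKED 128).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed values, matching Linux's generic errno numbering. */
#define TOY_ENOKEY      126
#define TOY_EKEYREVOKED 128

/* Hypothetical stand-in for a key held in a keyring. */
struct toy_key {
    const char *desc;
    bool revoked;
};

/*
 * Return 0 if a live key matches desc.  A revoked match is only reported
 * (-TOY_EKEYREVOKED) when no live key matches; with no match at all the
 * result is -TOY_ENOKEY.
 */
int toy_search(const struct toy_key *keys, size_t n, const char *desc)
{
    int ret = -TOY_ENOKEY;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(keys[i].desc, desc) != 0)
            continue;
        if (!keys[i].revoked)
            return 0;               /* a live key wins outright */
        ret = -TOY_EKEYREVOKED;     /* remember it, keep looking */
    }
    return ret;
}
```

With every blacklist entry revoked, a lookup can only come back as
-TOY_ENOKEY (not listed) or -TOY_EKEYREVOKED (blacklisted).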

David

Re: [PATCH v4 2/3] Create IMA machine owner keys (MOK) and blacklist keyrings;

2015-10-21 Thread David Howells

Petko Manolov  wrote:

> As far as i know there is no concept of write-once to a keyring in the
> kernel.  David will correct me if i am wrong.  I wonder how hard would it be
> to add such functionality, in case it is missing?

Not hard, particularly if it's only an attribute that the kernel can set.

> Ideally a revoked key should stay in .blacklist until it expires or the
> system is rebooted.

That's not quite sufficient.  Search would also need to be modified otherwise
the revoked key would be skipped.

David

Re: [PATCH v4 2/3] Create IMA machine owner keys (MOK) and blacklist keyrings;

2015-10-21 Thread David Howells

Mimi Zohar  wrote:

> Thinking about the blacklist keyring some more...

Are we talking about a blacklist keyring that userspace can use - or can it be
only usable by the kernel?

> My concern is more that keys can be added and removed at run time from
> either of the .ima or the ima_mok keyrings.  The need for a blacklist
> keyring is to prevent the key from being removed and at a later point
> re-added.  Unfortunately, keys can be added and removed similarly from the
> blacklist keyring as well.  Unless keys can be added, without the ability of
> removing them, I'm not sure of the benefit of a blacklist keyring.  I assume
> adding and removing keys requires the same write privilege.

The operations that modify the contents of a keyring in some way (link,
unlink, clear) all operate under Write privilege.  That said, we could add a
flag that suppresses unlink and clear on a keyring.  This could also suppress
garbage collection of revoked and invalidated keys.

Note, however, that keyring searches also skip revoked and invalidated keys,
so that would also need dealing with.
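
As a rough userspace sketch of what such a flag might look like (this is not
the kernel's keyring code; TOY_KEYRING_KEEP and every toy_ name below are
hypothetical, as are the local error constants), link stays permitted while
unlink and clear are refused once the flag is set:

```c
#include <stddef.h>

/* Local stand-ins for the usual errno values. */
#define TOY_EPERM   1
#define TOY_ENOENT  2
#define TOY_ENOSPC 28

#define TOY_MAX_LINKS    8
#define TOY_KEYRING_KEEP 0x1u   /* hypothetical "no unlink/clear" flag */

struct toy_keyring {
    unsigned int flags;
    int links[TOY_MAX_LINKS];   /* serial numbers of linked keys */
    size_t nr;
};

/* Adding a key is always allowed while there is room. */
int toy_link(struct toy_keyring *kr, int serial)
{
    if (kr->nr == TOY_MAX_LINKS)
        return -TOY_ENOSPC;
    kr->links[kr->nr++] = serial;
    return 0;
}

/* Removing a key is refused outright on a KEEP keyring. */
int toy_unlink(struct toy_keyring *kr, int serial)
{
    if (kr->flags & TOY_KEYRING_KEEP)
        return -TOY_EPERM;
    for (size_t i = 0; i < kr->nr; i++) {
        if (kr->links[i] == serial) {
            kr->links[i] = kr->links[--kr->nr];
            return 0;
        }
    }
    return -TOY_ENOENT;
}

/* Clearing is refused in the same way. */
int toy_clear(struct toy_keyring *kr)
{
    if (kr->flags & TOY_KEYRING_KEEP)
        return -TOY_EPERM;
    kr->nr = 0;
    return 0;
}
```

The sketch does not cover garbage collection or the search behaviour, both of
which would need the same treatment.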

> (We previously resolved the problem of keyrings being removed by
> userspace, even by a privileged user, by dot prefixing the keyrings.)

That doesn't stop keys being addressed directly for invalidation and
revocation, but you can probably manage that with permissions.

David

[PATCH 2/2] KEYS: Don't permit request_key() to construct a new keyring

2015-10-19 Thread David Howells

If request_key() is used to find a keyring, only do the search part - don't
do the construction part if the keyring was not found by the search.  We
don't really want keyrings in the negative instantiated state since the
rejected/negative instantiation error value in the payload is unioned with
keyring metadata.

Now the kernel gives an error:

request_key("keyring", "#selinux,bdekeyring", "keyring", KEY_SPEC_USER_SESSION_KEYRING) = -1 EPERM (Operation not permitted)

Signed-off-by: David Howells 
---

 security/keys/request_key.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 486ef6fa393b..0d6253124278 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -440,6 +440,9 @@ static struct key *construct_key_and_link(struct keyring_search_context *ctx,
 
kenter("");
 
+   if (ctx->index_key.type == &key_type_keyring)
+   return ERR_PTR(-EPERM);
+   
user = key_user_lookup(current_fsuid());
if (!user)
return ERR_PTR(-ENOMEM);


[PATCH 1/2] KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring

2015-10-19 Thread David Howells

The following sequence of commands:

i=`keyctl add user a a @s`
keyctl request2 keyring foo bar @t
keyctl unlink $i @s

tries to invoke an upcall to instantiate a keyring if one doesn't already
exist by that name within the user's keyring set.  However, if the upcall
fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
other error code.  When the key is garbage collected, the key destroy
function is called unconditionally and keyring_destroy() uses list_empty()
on keyring->type_data.link - which is in a union with reject_error.
Subsequently, the kernel tries to unlink the keyring from the keyring names
list - which oopses like this:

BUG: unable to handle kernel paging request at ff8a
IP: [] keyring_destroy+0x3d/0x88
...
Workqueue: events key_garbage_collector
...
RIP: 0010:[] keyring_destroy+0x3d/0x88
RSP: 0018:88003e2f3d30  EFLAGS: 00010203
RAX: ff82 RBX: 88003bf1a900 RCX: 
RDX:  RSI: 3bfc6901 RDI: 81a73a40
RBP: 88003e2f3d38 R08: 0152 R09: 
R10: 88003e2f3c18 R11: 865b R12: 88003bf1a900
R13:  R14: 88003bf1a908 R15: 88003e2f4000
...
CR2: ff8a CR3: 3e3ec000 CR4: 06f0
...
Call Trace:
 [] key_gc_unused_keys.constprop.1+0x5d/0x10f
 [] key_garbage_collector+0x1fa/0x351
 [] process_one_work+0x28e/0x547
 [] worker_thread+0x26e/0x361
 [] ? rescuer_thread+0x2a8/0x2a8
 [] kthread+0xf3/0xfb
 [] ? kthread_create_on_node+0x1c2/0x1c2
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_create_on_node+0x1c2/0x1c2

Note the value in RAX.  This is a 32-bit representation of -ENOKEY.

The solution is to only call ->destroy() if the key was successfully
instantiated.

Reported-by: Dmitry Vyukov 
Signed-off-by: David Howells 
Tested-by: Dmitry Vyukov 
---

 security/keys/gc.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/security/keys/gc.c b/security/keys/gc.c
index 39eac1fd5706..addf060399e0 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -134,8 +134,10 @@ static noinline void key_gc_unused_keys(struct list_head *keys)
kdebug("- %u", key->serial);
key_check(key);
 
-   /* Throw away the key data */
-   if (key->type->destroy)
+   /* Throw away the key data if the key is instantiated */
+   if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags) &&
+   !test_bit(KEY_FLAG_NEGATIVE, &key->flags) &&
+   key->type->destroy)
key->type->destroy(key);
 
security_key_free(key);


[PATCH 0/2] KEYS: Fix crash in GC

2015-10-19 Thread David Howells


Hi James, Linus,

Here are two patches, the first of which at least should go upstream
immediately:

 (1) Prevent a user-triggerable crash in the keyrings destructor when a
 negatively instantiated keyring is garbage collected.  I have also seen
 this triggered for user type keys.

 (2) Prevent the user from requesting that a keyring be created and
 instantiated through an upcall.  Doing so is probably safe since the
 keyring type ignores the arguments to its instantiation function - but we
 probably shouldn't let keyrings be created in this manner.
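
For illustration only, the guard in (1) can be sketched in userspace like
this.  The toy_ structures are simplified stand-ins, not the kernel's
struct key; the point is just that the destructor must not run when the union
holds an error code rather than a list head.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for the kernel's key flags. */
#define TOY_FLAG_INSTANTIATED (1u << 0)
#define TOY_FLAG_NEGATIVE     (1u << 1)

struct toy_list_head {
    struct toy_list_head *next, *prev;
};

struct toy_key {
    unsigned int flags;
    union {
        struct toy_list_head link;  /* valid for a real keyring */
        int reject_error;           /* valid for a rejected key */
    } type_data;
    int destroyed;                  /* set when the destructor "ran" */
};

/* Only a positively instantiated key has type data worth destroying. */
bool toy_may_destroy(const struct toy_key *key)
{
    return (key->flags & TOY_FLAG_INSTANTIATED) &&
           !(key->flags & TOY_FLAG_NEGATIVE);
}

/* Garbage-collect one key, skipping the destructor where it is unsafe. */
void toy_gc_one(struct toy_key *key)
{
    if (toy_may_destroy(key))
        key->destroyed = 1;
}
```

A key that was rejected keeps reject_error in the union, so toy_gc_one()
leaves it alone instead of treating that integer as list pointers.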

I'm okay with patch (2) being deferred to the next merge window if we're only
fixing security bugs at this time upstream.

The patches can be found here also:


http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=keys-fixes

David
---
David Howells (2):
  KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring
  KEYS: Don't permit request_key() to construct a new keyring


 security/keys/gc.c  |6 --
 security/keys/request_key.c |3 +++
 2 files changed, 7 insertions(+), 2 deletions(-)


Re: GPF in keyring_destroy

2015-10-19 Thread David Howells

Dmitry Vyukov  wrote:

> Do you mean in addition or instead of the previous one? From your
> description, it sounds like it alone should prevent the crash.

I'm going to submit them both, so it would help if you could test them
together.  You're right, though, I think this should also prevent the crash.

David

Re: GPF in keyring_destroy

2015-10-19 Thread David Howells

Dmitry Vyukov  wrote:

> > Does the attached patch fix it for you?
> 
> Yes, it fixes the crash for me.

I have an additional patch to prevent keyrings from being constructed by
request_key() at all (though it can still search for them).  Could you give
this a spin in addition to the previous one also?
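
In outline, the intended behaviour is "search as before, but refuse to
construct if the type is keyring".  A minimal userspace sketch of that control
flow follows; the toy_ names and return codes are invented for illustration,
and the real function would make an upcall where the sketch merely reports
that construction would happen.

```c
#include <stddef.h>
#include <string.h>

#define TOY_EPERM       1   /* local stand-in for EPERM */
#define TOY_FOUND       0   /* search succeeded */
#define TOY_CONSTRUCTED 1   /* search failed, construction would proceed */

/* Hypothetical stand-in for an existing key. */
struct toy_key {
    const char *type;
    const char *desc;
};

/*
 * Search first.  If nothing matches, construction is allowed for ordinary
 * key types but refused with -TOY_EPERM for the "keyring" type.
 */
int toy_request_key(const struct toy_key *keys, size_t n,
                    const char *type, const char *desc)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(keys[i].type, type) == 0 &&
            strcmp(keys[i].desc, desc) == 0)
            return TOY_FOUND;
    }
    if (strcmp(type, "keyring") == 0)
        return -TOY_EPERM;
    return TOY_CONSTRUCTED;
}
```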

Thanks,
David
---
commit 27874345bb8d2c39f3d493607a86ecbfcb100405
Author: David Howells 
Date:   Mon Oct 19 11:20:28 2015 +0100

KEYS: Don't permit request_key() to construct a new keyring

If request_key() is used to find a keyring, only do the search part - don't
do the construction part if the keyring was not found by the search.  We
don't really want keyrings in the negative instantiated state since the
rejected/negative instantiation error value in the payload is unioned with
keyring metadata.

Signed-off-by: David Howells 

diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 486ef6fa393b..0d6253124278 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -440,6 +440,9 @@ static struct key *construct_key_and_link(struct keyring_search_context *ctx,

kenter("");

+   if (ctx->index_key.type == &key_type_keyring)
+   return ERR_PTR(-EPERM);
+   
user = key_user_lookup(current_fsuid());
if (!user)
return ERR_PTR(-ENOMEM);

Re: GPF in keyring_destroy

2015-10-19 Thread David Howells

Dmitry Vyukov  wrote:

> Yes, sure. Do I need to say something like:
> 
> Tested-by: Dmitry Vyukov 
> 
> in future?

That helps:-)

David

Re: GPF in keyring_destroy

2015-10-19 Thread David Howells

Dmitry Vyukov  wrote:

> > Does the attached patch fix it for you?
> 
> Yes, it fixes the crash for me.

Can I put you down as a Tested-by?

David

Re: [PATCH v2 0/4] Basic trusted keys support for TPM 2.0

2015-10-16 Thread David Howells

Hi Jarkko,

For some reason I don't see patch 1.

David

Re: [PATCH] keys: change member variable name

2015-10-15 Thread David Howells

Insu Yun  wrote:

> key->description and key->index_key.description are same because 
> they are unioned. But, for readability, using same name for
> duplication and validation seems better.
> 
> Signed-off-by: Insu Yun 

I've applied this, but I've changed the subject line to:

keys: Be more consistent in selection of union members used

if you're okay with that.

David

Re: [PATCH] keys: correctly check failed allocation for kmemdup

2015-10-15 Thread David Howells

Insu Yun  wrote:

> Thanks David. Then it is not a bug.
> It's a pure question. 
> Why use different name for allocation and check?
> For me, it is quite confusing. 

Either I didn't notice at the time, or the shorter variant is the original.

If you want to give me a patch making it consistent, feel free.

David

Re: GPF in keyring_destroy

2015-10-15 Thread David Howells

Does the attached patch fix it for you?

David
---
commit a7609e0bb3973d6ee3c9f1ecd0b6a382d99d6248
Author: David Howells 
Date:   Thu Oct 15 17:21:37 2015 +0100

KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring

The following sequence of commands:

i=`keyctl add user a a @s`
keyctl request2 keyring foo bar @t
keyctl unlink $i @s

tries to invoke an upcall to instantiate a keyring if one doesn't already
exist by that name within the user's keyring set.  However, if the upcall
fails, the code sets keyring->type_data.reject_error to -ENOKEY or some
other error code.  When the key is garbage collected, the key destroy
function is called unconditionally and keyring_destroy() uses list_empty()
on keyring->type_data.link - which is in a union with reject_error.
Subsequently, the kernel tries to unlink the keyring from the keyring names
list - which oopses like this:

BUG: unable to handle kernel paging request at ff8a
IP: [] keyring_destroy+0x3d/0x88
...
Workqueue: events key_garbage_collector
...
RIP: 0010:[] keyring_destroy+0x3d/0x88
RSP: 0018:88003e2f3d30  EFLAGS: 00010203
RAX: ff82 RBX: 88003bf1a900 RCX: 
RDX:  RSI: 3bfc6901 RDI: 81a73a40
RBP: 88003e2f3d38 R08: 0152 R09: 
R10: 88003e2f3c18 R11: 865b R12: 88003bf1a900
R13:  R14: 88003bf1a908 R15: 88003e2f4000
...
CR2: ff8a CR3: 3e3ec000 CR4: 06f0
...
Call Trace:
 [] key_gc_unused_keys.constprop.1+0x5d/0x10f
 [] key_garbage_collector+0x1fa/0x351
 [] process_one_work+0x28e/0x547
 [] worker_thread+0x26e/0x361
 [] ? rescuer_thread+0x2a8/0x2a8
 [] kthread+0xf3/0xfb
 [] ? kthread_create_on_node+0x1c2/0x1c2
 [] ret_from_fork+0x3f/0x70
 [] ? kthread_create_on_node+0x1c2/0x1c2

Note the value in RAX.  This is a 32-bit representation of -ENOKEY.

The solution is to only call ->destroy() if the key was successfully
instantiated.

Reported-by: Dmitry Vyukov 
Signed-off-by: David Howells 

diff --git a/security/keys/gc.c b/security/keys/gc.c
index 39eac1fd5706..addf060399e0 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -134,8 +134,10 @@ static noinline void key_gc_unused_keys(struct list_head *keys)
kdebug("- %u", key->serial);
key_check(key);
 
-   /* Throw away the key data */
-   if (key->type->destroy)
+   /* Throw away the key data if the key is instantiated */
+   if (test_bit(KEY_FLAG_INSTANTIATED, &key->flags) &&
+   !test_bit(KEY_FLAG_NEGATIVE, &key->flags) &&
+   key->type->destroy)
key->type->destroy(key);
 
security_key_free(key);

Re: [PATCH] keys: correctly check failed allocation for kmemdup

2015-10-15 Thread David Howells

Insu Yun  wrote:

> kmemdup return value is saved in 'key->index_key.description', not
> 'key->description', and kmemdup can fail under memory pressure.
> Therefore, key->index_key.description should be checked.

The fields are unioned.  It makes no difference.
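
A small sketch of the aliasing, using a simplified toy layout rather than the
real struct key (the member names mirror the ones under discussion, the rest
is invented, and the anonymous struct needs C11 or later):

```c
#include <stddef.h>

/* Simplified stand-in for the index key part. */
struct toy_index_key {
    const void *type;
    char *description;
    size_t desc_len;
};

/*
 * The anonymous struct overlays the first two members of index_key, so
 * key->description and key->index_key.description are the same storage.
 */
struct toy_key {
    union {
        struct toy_index_key index_key;
        struct {
            const void *type;
            char *description;
        };
    };
};

/* Checking either spelling tests the same pointer. */
int toy_description_missing(const struct toy_key *key)
{
    return key->description == NULL;
}
```

So a check on one name also covers an assignment made through the other; the
only cost of mixing them is readability.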

David
--
To unsubscribe from this list: send the line "unsubscribe 
linux-security-module" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: GPF in keyring_destroy

2015-10-15 Thread David Howells

Dmitry Vyukov  wrote:

> RAX: ff82

This is the value that matters.  It would appear to be -ENOKEY and would be in
key->type_data.reject_error, I think.
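
As a quick check of the arithmetic, here is a tiny sketch that assumes ENOKEY
is 126, as in Linux's generic errno numbering (the toy_ names are made up).
Negating it and truncating to 32 bits gives 0xffffff82, whose low bits match
the ff82 quoted above.

```c
#include <stdint.h>

/* Assumption: ENOKEY is 126, as in Linux's generic errno numbering. */
#define TOY_ENOKEY 126

/* Return the 32-bit two's-complement bit pattern of -err. */
uint32_t toy_neg_errno_bits(int err)
{
    return (uint32_t)(-err);
}
```

toy_neg_errno_bits(TOY_ENOKEY) evaluates to 0xffffff82.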

David

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-26 Thread David Howells

Daniel Phillips wrote:

> I need to respond to this in pieces... first the bit that is bugging
> me:
> 
> > >   * two new page flags
> > 
> > I need to keep track of two bits of per-cached-page information:
> > 
> >  (1) This page is known by the cache, and that the cache must be informed if
> >  the page is going to go away.
> 
> I still do not understand the life cycle of this bit.  What does the
> cache do when it learns the page has gone away?

That's up to the cache.  CacheFS, for example, unpins some resources when all
the pages managed by a pointer block are taken away from it.  The cache may
also reserve a block on disk to back this page, and that reservation may then
be discarded by the netfs uncaching the page.

The cache may also speculatively take copies of the page if the machine is
idle.

Documentation/filesystems/caching/netfs-api.txt describes the caching API as a
process, including the presentation of netfs pages to the cache and their
uncaching.

> How is it informed?

[Documentation/filesystems/caching/netfs-api.txt]
==============
PAGE UNCACHING
==============

To uncache a page, this function should be called:

void fscache_uncache_page(struct fscache_cookie *cookie,
  struct page *page);

This function permits the cache to release any in-memory representation it
might be holding for this netfs page.  This function must be called once for
each page on which the read or write page functions above have been called to
make sure the cache's in-memory tracking information gets torn down.

Note that pages can't be explicitly deleted from the data file.  The whole
data file must be retired (see the relinquish cookie function below).

Furthermore, note that this does not cancel the asynchronous read or write
operation started by the read/alloc and write functions.
[/]

> Who owns the page cache in which such a page lives, the nfs client?
> Filesystem that hosts the page?  A third page cache owned by the
> cache itself?  (See my basic confusion about how many page cache
> levels you have, below.)

[Documentation/filesystems/caching/fscache.txt]
 (7) Data I/O is done direct to and from the netfs's pages.  The netfs
 indicates that page A is at index B of the data-file represented by cookie
 C, and that it should be read or written.  The cache backend may or may
 not start I/O on that page, but if it does, a netfs callback will be
 invoked to indicate completion.  The I/O may be either synchronous or
 asynchronous.
[/]

I should perhaps make the documentation more explicit: the pages passed to the
routines defined in include/linux/fscache.h are netfs pages, normally belonging
to the pagecache of the appropriate netfs inode.  This is, however, mentioned in
the function banner comments in fscache.h.

> Suppose one were to take a mundane approach to the persistent cache
> problem instead of layering filesystems.  What you would do then is
> change NFS's ->write_page and variants to fiddle the persistent
> cache

It is a requirement laid down by the Linux NFS fs maintainers that the writes
to the cache be asynchronous, even if the writes to NFS aren't.

Note further that NFS's write_page() != writing to the cache.  Writing to the
cache is typically done by NFS's readpages().

Besides, at the moment, caching is suppressed for any NFS file opened for
writing due to coherency issues.  This is something to be revisited later.

> as well as the network, instead of just the network as now.

Not as now.  See above.

> This fiddling could even consist of ->write calls to another
> filesystem, though working directly with the bio interface would
> yield the fastest, and therefore to my mind, best result.

You can't necessarily access the BIO interface, and even if you can, the cache
is still a filesystem.

Essentially, what cachefiles does is to do what you say: to perform ->write
calls on another filesystem.

FS-Cache also protects the netfs against (a) there being no cache, (b) the
cache suffering a fatal I/O error and (c) the cache being removed; and protects
the cache against (d) the netfs uncaching pages that the cache is using and (e)
conflicting operations from the netfs, some of which may be queued for
asynchronous processing.

FS-Cache also groups asynchronous netfs store requests together, which
hopefully, one day, I'll be able to pass on to the backing fs.

> In any case, you find out how to write the page to backing store by
> asking the filesystem, which in the naive approach would be nfs
> augmented with caching library calls.

NFS and AFS and CIFS and ISOFS, but yes, that's what fscache is, if you like, a
caching library.

> The filesystem keeps its own metadata around to know how to map the page to
> disk.  So again naively, this metadata could tell the nfs client that the
> page is not mapped to disk at all.

The netfs should _not_ know about the metadata of a backing fs.  Firstly, there
are many different potent

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-25 Thread David Howells

Daniel Phillips wrote:

> On Monday 25 February 2008 15:19, David Howells wrote:
> > So I guess there's a problem in cachefiles's efficiency - possibly due
> > to the fact that it tries to be fully asynchronous.
> 
> OK, not just my imagination, and it makes me feel better about the patch 
> set because efficiency bugs are fixable while fundamental limitations 
> are not.

One can hope:-)

> How much of a hurry are you in to merge this feature?  You have bits 
> like this:

I'd like to get it upstream sooner rather than later.  As it's not upstream,
but its prerequisite patches touch a lot of code, I have to spend time
regularly making my patches work again.  Merge windows are completely not fun.

> "Add a function to install a monitor on the page lock waitqueue for a 
> particular page, thus allowing the page being unlocked to be detected.
> This is used by CacheFiles to detect read completion on a page in the 
> backing filesystem so that it can then copy the data to the waiting 
> netfs page."
> 
> We already have that hook, it is called bio_endio.

Except that isn't accessible.  CacheFiles currently has no access to the
notification from the blockdev to the backing fs, if indeed there is one.  All
we can do is trap the backing fs page becoming available.

> My strong intuition is that your whole mechanism should sit directly on the
> block device, no matter how attractive it seems to be able to piggyback on
> the namespace and layout management code of existing filesystems.

There's a place for both.

Consider a laptop with a small disk, possibly subdivided between Linux and
Windows.  Linux then subdivides its bit further to get a swap space.  What you
then propose is to break off yet another chunk to provide the cache.  You
can't then use this other chunk for anything else, even if it's, say, 1% used
by the cache.

The way CacheFiles works is that you tell it that it can use up to a certain
percentage of the otherwise free disk space on an otherwise existing
filesystem.  In the laptop case, you may just have a single big partition.  The
cache will fill up as much of it as it can, and as the other contents of the
partition consume space, the cache will be culled to make room.
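
The space-sharing policy described above can be sketched as follows.  This is
only an illustration, not the CacheFiles implementation: the function name
cache_policy and both percentage thresholds are hypothetical, chosen to show
how a cache might stop growing and start culling as other data fills the
partition.

```python
def cache_policy(total_blocks, free_blocks, cull_below_pct=7.0, stop_below_pct=3.0):
    """Decide what a cache sharing a filesystem should do (hypothetical policy).

    Returns "grow" while plenty of space is free, "cull" once free space
    drops below cull_below_pct, and "stop" (cull and refuse new allocations)
    once it drops below stop_below_pct.
    """
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    free_pct = 100.0 * free_blocks / total_blocks
    if free_pct < stop_below_pct:
        return "stop"
    if free_pct < cull_below_pct:
        return "cull"
    return "grow"


# A partition that is half empty lets the cache grow; one that is 95% full
# makes it give space back.
print(cache_policy(1000, 500))
print(cache_policy(1000, 50))
```

Keeping two thresholds rather than one gives the culler some room to work
before the cache has to refuse new data outright.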

On the other hand, on a system like my desktop, where I can slap in extra disks
with mounds of extra disk space, it might very well make sense to commit block
devices to caching, as this can be used to gain performance.

I have another cache backend (CacheFS) which takes the form of a filesystem,
thus allowing you to mount a blockdev as a cache.  It's much faster than Ext3
at storing and retrieving files... at first.  The problem is that I've mucked
up the free space retrieval such that performance degrades by 20x over time for
files of any size.

Basically any cache on a raw blockdev _is_ a filesystem, just one in which
you're randomly allowed to discard data to make life easier.

> I see your current effort as the moral equivalent of FUSE: you are able to
> demonstrate certain desirable behavioral properties, but you are unable to
> reach full theoretical efficiency because there are layers and layers of
> interface gunk interposed between the netfs user and the cache device.

The interface gunk is meant to be as thin as possible, but there are
constraints (see the documentation in the main FS-Cache patch for more
details):

 (1) It's a requirement that it not be tied only to, say, AFS.  We might have
 several netfs's that want caching: AFS, CIFS, ISOFS (okay, that last isn't
 really a netfs, but it might still want caching).

 (2) I want to be able to change the backing cache.  Under some circumstances I
 might want to use an existing filesystem, under others I might want to
 commit a blockdev.  I've even been asked about using battery-backed RAM -
 which has different design constraints.

 (3) The constraint has been imposed by the NFS team that the cache be
 completely asynchronous.  I haven't quite met this: readpages() will wait
 until the cache knows whether or not the pages are available on the
 principle that read operations done through the cache can be considered
 synchronous.  This is an attempt to reduce the context switchage involved.

Unfortunately, the asynchronicity requirement has caused the middle layer to
bloat.  Fortunately, the backing cache needn't bloat as it can use the middle
layer's bloat.

> That said, I also see you have put a huge amount of work into this over 
> the years, it is nicely broken out, you are responsive and easy to work 
> with, all arguments for an early merge.  Against that, you invade core 
> kernel for reasons that are not necessarily justified:
>
>   * two new page flags

I need to keep track of two bits of per-cached-page information:

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-25 Thread David Howells

Daniel Phillips <[EMAIL PROTECTED]> wrote:

> This factor of four (even worse on XFS, not quite as bad on Ext3) is
> worth ruminating upon.  Is all of the difference explained by avoiding
> seeks on the server, which has the files in memory?

Here are some more stats for you to consider:

 (1) Copy the data across the network to a fresh Ext3 fs on the same partition
 I was using for the cache:

[EMAIL PROTECTED] ~]# time cp -a /warthog/aaa /var/fscache
real    0m39.052s
user    0m0.368s
sys     0m15.229s

 (2) Reboot and read back the files just written into Ext3 on the local disk:

[EMAIL PROTECTED] ~]# time tar cf - /var/fscache/aaa >/dev/zero
real    0m40.574s
user    0m0.164s
sys     0m3.512s

 (3) Run through the cache population process, and then run a tar directly on
 cachefiles's cache immediately after a reboot:

[EMAIL PROTECTED] ~]# time tar cf - /var/fscache/cache >/dev/zero
real    4m53.104s
user    0m0.192s
sys     0m4.240s

So I guess there's a problem in cachefiles's efficiency - possibly due to the
fact that it tries to be fully asynchronous.

In case (1) this is very similar to the time for a read through a completely
cold cache (37.497s).

In case (2) this is comparable to cachefiles with a cache warmed prior to a
reboot (1m54.350s); in this case, however, cachefiles is doing some extra work:

 (a) It's doing a lookup on the server for each file, in addition to the
 lookups on the disk.  However, just doing a tar from plain NFS, the
 command completes in 22.330s.

 (b) It's reading an xattr per object for cache coherency management.

 (c) As the cache knows nothing of directories, files, etc., it lays its
 directory subtree out in a way that suits it.  File lookup keys are
 turned into filenames.  This may result in a less efficient arrangement
 in the cache than the original data, especially as directories may become
 very large, so Ext3 may be doing some extra work.

In case (3), this perhaps suggests that cachefiles's directory layout may be
part of the problem.  Running the following:

ls -ldSr `find . -type d`

in /var/fscache/cache shows that the directories are either 4096 bytes in size
(158 instances) or 12288 bytes in size (105 instances), for a total of 263
directories.  There are 19255 files.

Running that ls command in /warthog/aaa shows 1185 directories, all but three
of them 4096 bytes in size; two are 12288 bytes and one is 20480 bytes in size
(include/linux/ unsurprisingly).  There are 19258 files, three of which are
hardlinks to other files in the tree.
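
To put the two layouts side by side, the counts quoted above can be reduced to
an average number of files per directory.  This is plain arithmetic on the
figures in this message; the helper name is hypothetical, and it ignores
subdirectory entries, so it only approximates the real directory occupancy.

```python
def files_per_directory(files, directories):
    """Average number of files per directory."""
    return files / directories


cache_avg = files_per_directory(19255, 263)     # cachefiles's own layout
source_avg = files_per_directory(19258, 1185)   # original tree in /warthog/aaa

print(f"cache:  {cache_avg:.1f} files/dir")
print(f"source: {source_avg:.1f} files/dir")
print(f"ratio:  {cache_avg / source_avg:.1f}x")
```

The cache packs roughly 73 files into each directory against roughly 16 in the
source tree, which is consistent with the suggestion that large directories
are making Ext3 do extra work.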

> This could be easily tested by running a test against a server that is the
> same as the client, and does not have the files in memory.  If local access
> is still slower than network then there is a real issue with cache
> efficiency.

My server is also my desktop machine.  The only way to guarantee that the
memory is scrubbed is to reboot it:-(  I'll look at setting up one of my other
machines as an NFS server.

David

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells

Daniel Phillips <[EMAIL PROTECTED]> wrote:

> I am eventually going to suggest cutting the backing filesystem entirely out
> of the picture,

You still need a database to manage the cache.  A filesystem such as Ext3
makes a very handy database for four reasons:

 (1) It exists and works.

 (2) It has a well defined interface within the kernel.

 (3) I can place my cache on, say, my root partition on my laptop.  I don't
 have to dedicate a partition to the cache.

 (4) Userspace cache management tools (such as cachefilesd) have an already
 existing interface to use: rmdir, unlink, open, getdents, etc.
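
Point (4) can be illustrated with a userspace culler that needs nothing beyond
ordinary filesystem calls.  This is a sketch under stated assumptions and is
not how cachefilesd is written: the function cull_oldest is hypothetical, and
it uses access times as the only eviction criterion.

```python
import os


def cull_oldest(cache_root, bytes_to_free):
    """Unlink the least recently accessed files under cache_root until at
    least bytes_to_free bytes have been released.  Returns bytes released."""
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            candidates.append((st.st_atime, st.st_size, path))

    candidates.sort()                 # oldest access time first
    freed = 0
    for _atime, size, path in candidates:
        if freed >= bytes_to_free:
            break
        os.unlink(path)               # the same call any userspace tool would use
        freed += size
    return freed
```

Because the cache is just a directory tree, the culler needs no private
interface to the kernel beyond what every filesystem already offers.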

I do have a cache-on-blockdev thing, but it's basically a wandering tree
filesystem inside.  It is, or was, much faster than ext3 on a clean cache, but
it degrades horribly over time because my free space reclamation sucks - it
gradually randomises the block allocation sequence over time.

So, what would you suggest instead of a backing filesystem?

> I really do not like idea of force fitting this cache into a generic
> vfs model.  Sun was collectively smoking some serious crack when they
> cooked that one up.  But there is also the ageless principle "isness is
> more important than niceness".

What do you mean?  I'm not doing it like Sun.  The cache is a side path from
the netfs.  It should be transparent to the user, the VFS and the server.

The only place it might not be transparent is that you might have to
instruct the netfs mount to use the cache.  I'd prefer to do it some other way
than passing parameters to mount, though, as (1) this causes fun with NIS
distributed automounter maps, and (2) people are asking for a finer grain of
control than per-mountpoint.  Unfortunately, I can't seem to find a way to do
it that's acceptable to Al.

> Which would require a change to NFS, not an option because you hope to
> work with standard servers?  Of course with years to think about this,
> the required protocol changes were put into v4.  Not.

I don't think there's much I can do about NFS.  It requires the filesystem
from which the NFS server is dealing to have inode uniquifiers, which are then
incorporated into the file handle.  I don't think the NFS protocol itself
needs to change to support this.

> Have you completely exhausted optimization ideas for the file handle
> lookup?

No, but there aren't many.  CacheFiles doesn't actually do very much, and it's
hard to reduce that not very much.  The most obvious thing is to prepopulate
the dcache, but that's at the expense of memory usage.

Actually, if I cache the name => FH mapping I used last time, I can make a
start on looking up in the cache whilst simultaneously accessing the server.
If what's on the server has changed, I can ditch the speculative cache lookup
I was making and start a new cache lookup.
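
The speculative lookup idea can be sketched like this.  It is an illustration
only: open_cached, server_lookup and cache_lookup are hypothetical names, and
threads stand in for whatever asynchronous mechanism the kernel would really
use.

```python
from concurrent.futures import ThreadPoolExecutor


def open_cached(name, last_fh, server_lookup, cache_lookup):
    """Overlap a speculative cache lookup with the server LOOKUP.

    last_fh is the file handle remembered from a previous lookup (or None).
    server_lookup(name) returns the current file handle; cache_lookup(fh)
    returns the cache object for that handle (or None).
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        speculative = None
        if last_fh is not None:
            speculative = pool.submit(cache_lookup, last_fh)
        current_fh = pool.submit(server_lookup, name).result()

        if speculative is not None and current_fh == last_fh:
            return current_fh, speculative.result()   # the guess was right

        # The guess was wrong or absent: ditch it and look up the real handle.
        return current_fh, cache_lookup(current_fh)
```

When the remembered handle is still valid, the disk lookup and the network
round trip run side by side instead of one after the other.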

However, storing directory entries has penalties of its own, though it'll be
necessary if we want to do disconnected operation.

> > Where "lookup table" == "dcache".  That would be good yes.  cachefilesd
> > prescans all the files in the cache, which ought to do just that, but it
> > doesn't seem to be very effective.  I'm not sure why.
> 
> RCU?  Anyway, it is something to be tracked down and put right.

cachefilesd runs in userspace.  It's possible it isn't doing enough to preload
all the metadata.

> What I tried to say.  So still... got any ideas?  That extra synchronous
> network round trip is a killer.  Can it be made streaming/async to keep
> throughput healthy?

That's a per-netfs thing.  With the test rig I've got, it's going to the
on-disk cache that's the killer.  Going over the network is much faster.

See the results I posted.  For the tarball load, and using Ext3 to back the
cache:

Cold NFS cache, no disk cache:          0m22.734s
Warm on-disk cache, cold pagecaches:    1m54.350s

The problem is that reading using tar is a worst-case workload for this.  Everything
it does is pretty much completely synchronous.
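
For scale, the two timings above can be converted to seconds and compared.
The parsing helper is hypothetical; the figures are the ones quoted in this
message.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '1m54.350s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


cold_nfs = parse_time("0m22.734s")    # cold NFS cache, no disk cache
warm_disk = parse_time("1m54.350s")   # warm on-disk cache, cold pagecaches
slowdown = warm_disk / cold_nfs
print(f"warm on-disk cache takes {slowdown:.1f}x as long as plain NFS")
```

On this rig the disk-backed path takes about five times as long as going
straight to the server.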

One thing that might help is if things like tar and find can be made to use
fadvise() on directories to hint to the filesystem (NFS, AFS, whatever) that
it's going to access every file in those directories.

Certainly AFS could make use of that: the directory is read as a file, and the
netfs then parses the file to get a list of vnode IDs that that directory
points to.  It could then do bulk status fetch operations to instantiate the
inodes 50 at a time.
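
The batching step might look like the following sketch.  The function names
and the fetch_many callable are hypothetical; only the batch size of 50 comes
from the text above.

```python
def batches(items, size=50):
    """Yield successive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_fetch_status(vnode_ids, fetch_many, size=50):
    """Fetch status for every vnode ID using one request per batch.

    fetch_many is a hypothetical callable that takes a list of IDs and
    returns a dict of {vnode_id: status}.
    """
    statuses = {}
    for batch in batches(list(vnode_ids), size):
        statuses.update(fetch_many(batch))
    return statuses
```

A directory with 120 entries would then cost three round trips rather than
120.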

I don't know whether NFS could use it.  Someone like Trond or SteveD or Chuck
would have to answer that.

David

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells

Chris Mason <[EMAIL PROTECTED]> wrote:

> Thanks for trying this, of course I'll ask you to try again with the latest 
> v0.13 code, it has a number of optimizations especially for CPU usage.

Here you go.  The numbers are very similar.

David

=================================
FEW BIG FILES TEST ON BTRFS v0.13
=================================

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m2.202s
user    0m0.000s
sys     0m1.716s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m4.212s
user    0m0.000s
sys     0m0.896s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m0.197s
user    0m0.000s
sys     0m0.192s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m0.376s
user    0m0.000s
sys     0m0.372s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m1.543s
user    0m0.004s
sys     0m1.448s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m3.111s
user    0m0.000s
sys     0m2.856s


==================================================
MANY SMALL/MEDIUM FILE READING TEST ON BTRFS v0.13
==================================================

Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    0m31.575s
user    0m0.176s
sys     0m6.316s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    0m16.081s
user    0m0.164s
sys     0m5.528s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    2m15.245s
user    0m0.064s
sys     0m2.808s


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells

David Howells <[EMAIL PROTECTED]> wrote:

> > > Have you got before/after benchmark results?
> > 
> > See attached.
> 
> Attached here are results using BTRFS (patched so that it'll work at all)
> rather than Ext3 on the client on the partition backing the cache.

And here are XFS results.

Tuning XFS makes a *really* big difference in the case where lots of
small/medium files are being tarred.  However, in general BTRFS is much better.

David
---


=========================
FEW BIG FILES TEST ON XFS
=========================

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m2.286s
user    0m0.000s
sys     0m1.828s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m4.228s
user    0m0.000s
sys     0m1.360s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m0.058s
user    0m0.000s
sys     0m0.060s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m0.122s
user    0m0.000s
sys     0m0.120s

Warm XFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m0.181s
user    0m0.000s
sys     0m0.180s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m1.034s
user    0m0.000s
sys     0m0.404s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m1.540s
user    0m0.000s
sys     0m0.256s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m3.003s
user    0m0.000s
sys     0m0.532s


==========================================
MANY SMALL/MEDIUM FILE READING TEST ON XFS
==========================================

Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    4m56.827s
user    0m0.180s
sys     0m6.668s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    0m15.084s
user    0m0.212s
sys     0m5.008s

Warm XFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    0m13.547s
user    0m0.220s
sys     0m5.652s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    4m36.316s
user    0m0.148s
sys     0m4.440s


=======================================================
MANY SMALL/MEDIUM FILE READING TEST ON AN OPTIMISED XFS
=======================================================

mkfs.xfs -d agcount=4 -l size=128m,version=2 /dev/sda6


Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    3m44.033s
user    0m0.248s
sys     0m6.632s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    3m8.582s
user    0m0.108s
sys     0m3.420s

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells

Chris Mason <[EMAIL PROTECTED]> wrote:

> > The interesting case is where the disk cache is warm, but the pagecache is
> > cold (ie: just after a reboot after filling the caches).  Here, for the two
> > big files case, BTRFS appears quite a bit better than Ext3, showing a 21%
> > reduction in time for the smaller case and a 13% reduction for the larger
> > case.
> 
> I'm afraid I don't have a good handle on the filesystem operations that
> result from this workload.  Are we reading from the FS to fill the NFS page
> cache?

I'm not sure what you're asking.

When the cache is cold, we determine that we can't read from the cache very
quickly.  We then read data from the server and, in the background, create the
metadata in the cache and store the data to it (by copying netfs pages to
backingfs pages).

When the cache is warm, we read the data from the cache, copying the data from
the backingfs pages to the netfs pages.  We use bmap() to ascertain that there
is data to be read, otherwise we detect a hole and fall back to reading from
the server.
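
The read decision described in the last two paragraphs can be modelled in a
few lines.  This is a userspace toy, not kernel code: a dict stands in for the
backing file, a missing key stands in for a hole that bmap() would report, and
all names are hypothetical.

```python
def read_page(index, cache_pages, server_read, store_queue):
    """Return page `index`, preferring the cache.

    cache_pages maps page index to data and stands in for the backing file;
    a missing key models a hole.  Pages fetched from the server are appended
    to store_queue so that they can be written to the cache later.
    """
    if index in cache_pages:             # data present: read from the cache
        return cache_pages[index]
    data = server_read(index)            # hole: fall back to the server
    store_queue.append((index, data))    # defer the cache write
    return data
```

The point of the model is that a miss costs no more than the server read
itself, since the store is queued rather than performed inline.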

Looking up a cache object involves a sequence of lookup() ops and getxattr() ops
on the backingfs.  Should an object not exist, we defer creation of that
object to a background thread and do lookups(), mkdirs() and setxattrs() and a
create() to manufacture the object.

We read data from an object by calling readpages() on the backingfs to bring
the data into the pagecache.  We monitor the PG_lock bits to find out when
each page is read or has completed with an error.

Writing pages to the cache is done completely in the background.
PG_fscache_write is set on a page when it is handed to fscache for storage,
then at some point a background thread wakes up and calls write_one_page() in
the backingfs to write that page to the cache file.  At the moment, this
copies the data into a backingfs page which is then marked PG_dirty, and the
VM writes it out in the usual way.
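
The background write path can be pictured as a producer/consumer queue.  The
sketch below is an analogy in userspace Python, with a thread and a dict
standing in for the kernel thread and the cache file; none of the names come
from the patches.

```python
import queue
import threading


def start_cache_writer(pending, backing_store):
    """Start a thread that drains `pending` into `backing_store`.

    Each queue item is (page_index, data); None tells the thread to exit.
    """
    def worker():
        while True:
            item = pending.get()
            if item is None:
                pending.task_done()
                break
            index, data = item
            backing_store[index] = data   # stands in for the write to the cache file
            pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


pending = queue.Queue()
store = {}
writer = start_cache_writer(pending, store)
pending.put((0, b"page zero"))   # the caller returns immediately
pending.put(None)
pending.join()
writer.join()
print(store)
```

The caller that queues a page never waits for the disk, which is the property
the asynchronous design is after.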

> > More surprising is that BTRFS performed significantly worse (15% increase
> > in time) in the case where the cache on disk was fully populated and then
> > the machine had been rebooted to clear the pagecaches.
> 
> Which FS operations are included here?  Finding all the files or just an 
> unmount?  Btrfs defrags metadata in the background, and unmount has to wait 
> for that defrag to finish.

BTRFS might not be doing any writing at all here - apart from local atimes
(used by cache culling), that is.

What it does have to do is lots of lookups, reads and getxattrs, all of which
are synchronous.

David

Re: [PATCH 09/37] Security: Allow kernel services to override LSM settings for task actions

2008-02-22 Thread David Howells

Casey Schaufler <[EMAIL PROTECTED]> wrote:

> > +static int smack_task_kernel_act_as(struct task_struct *p,
> > +   struct task_security *sec, u32 secid)
> > +{
> > +   return -ENOTSUPP;
> > +}
> ...
> > +static int smack_task_create_files_as(struct task_struct *p,
> > + struct task_security *sec,
> > + struct inode *inode)
> > +{
> > +   return -ENOTSUPP;
> > +}
> 
> Hum. ENOTSUPP is not not very satisfying, is it? I will have to
> think on this a bit.

Sorry, I meant to ping you on this directly.  I'm not sure how to effect these
two functions for Smack.

> Except for the fact that the hooks don't do anything this
> looks fine. I'm not sure that I would want these hooks to
> do anything, it requires additional thought to determine if
> there is a good behavior for them.

Note that you won't be able to use CacheFiles with Smack if either of these
just returns an error.  This may also affect NFSd in the future too.

smack_task_create_files_as() is passed the label that new files created by
CacheFiles should be created with.

For smack_task_kernel_act_as(), it may be sufficient to set CAP_MAC_OVERRIDE in
the task_security struct and leave it at that.  It also may not be sufficient,
as NFSd may end up using this to set the subjective security label supplied by
the NFS client.  I don't know, though, whether Smack is going to be involved in
passing labels over NFS.

David

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-22 Thread David Howells

Daniel Phillips <[EMAIL PROTECTED]> wrote:

> > The way the client works is like this:
> 
> Thanks for the excellent ascii art, that cleared up the confusion right
> away.

You know what they say about pictures... :-)

> > What are you trying to do exactly?  Are you actually playing with it, or
> > just looking at the numbers I've produced?
> 
> Trying to see if you are offering enough of a win to justify testing it,
> and if that works out, then going shopping for a bin of rotten vegetables
> to throw at your design, which I hope you will perceive as useful.

One thing that you have to remember: my test setup is pretty much the worst
case for showing that caching improves performance.  There's a single client
and a single server, they've got GigE
networking between them that has very little other load, and the server has
sufficient memory to hold the entire test data set.

> From the numbers you have posted I think you are missing some basic
> efficiencies that could take this design from the sorta-ok zone to wow!

Not really, it's just that this lashup could be considered designed to show
local caching in the worst light.

> But looking up the object in the cache should be nearly free - much less
> than a microsecond per block.

The problem is that you have to do a database lookup of some sort, possibly
involving several synchronous disk operations.

CacheFiles does a disk lookup by taking the key given to it by NFS, turning it
into a set of file or directory names, and doing a short pathwalk to the target
cache file.  Throwing in extra indices won't necessarily help.  What matters is
how quick the backing filesystem is at doing lookups.  As it turns out, Ext3 is
a fair bit better than BTRFS when the disk cache is cold.
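
A key-to-path mapping of the kind described might look like this.  The
encoding and the fan-out scheme are assumptions made for illustration and are
not taken from the CacheFiles source: the key is hex-encoded, and two hex
digits of a hash pick an intermediate directory.

```python
import hashlib
import os


def key_to_path(cache_root, key, fanout_chars=2):
    """Map an opaque lookup key (bytes) to a path under cache_root.

    The key is hex-encoded to make it filename-safe, and the first few hex
    digits of its SHA-1 pick an intermediate directory so that no single
    directory grows too large.  Very long keys would need splitting to stay
    within filename length limits; that is not handled here.
    """
    name = key.hex()
    bucket = hashlib.sha1(key).hexdigest()[:fanout_chars]
    return os.path.join(cache_root, bucket, name)


print(key_to_path("/var/fscache/cache", b"\x01\x02\x03"))
```

Each extra path component is another lookup on the backing filesystem, which
is why the speed of that filesystem's lookups dominates.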

> > The metadata problem is quite a tricky one since it increases with the
> > number of files you're dealing with.  As things stand in my patches, when
> > NFS, for example, wants to access a new inode, it first has to go to the
> > server to lookup the NFS file handle, and only then can it go to the cache
> > to find out if there's a matching object in the cache.
> 
> So without the persistent cache it can omit the LOOKUP and just send the
> filehandle as part of the READ?

What 'it'?  Note that to get the filehandle, you have to do a LOOKUP op.  With
the cache, we could actually cache the results of lookups that we've done,
however, we don't know that the results are still valid without going to the
server:-/

AFS has a way around that - it versions its vnode (inode) IDs.

> > The reason my client going to my server is so quick is that the server has
> > the dcache and the pagecache preloaded, so that across-network lookup
> > operations are really, really quick, as compared to the synchronous
> > slogging of the local disk to find the cache object.
> 
> Doesn't that just mean you have to preload the lookup table for the
> persistent cache so you can determine whether you are caching the data
> for a filehandle without going to disk?

Where "lookup table" == "dcache".  That would be good yes.  cachefilesd
prescans all the files in the cache, which ought to do just that, but it
doesn't seem to be very effective.  I'm not sure why.

> > I can probably improve this a little by pre-loading the subindex
> > directories (hash tables) that I use to reduce the directory size in the
> > cache, but I don't know by how much.
> 
> Ah I should have read ahead.  I think the correct answer is "a lot".

Quite possibly.  It'll allow me to dispense with at least one fs lookup call
per cache object request call.

> Your big can-t-get-there-from-here is the round trip to the server to
> determine whether you should read from the local cache.  Got any ideas?

I'm not sure what you mean.  Your statement should probably read "... to
determine _what_ you should read from the local cache".

> And where is the Trond-meister in all of this?

Keeping quiet as far as I can tell.

David

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells

Daniel Phillips <[EMAIL PROTECTED]> wrote:

> When you say Ext3 cache vs NFS cache is the first on the server and the 
> second on the client?

The filesystem on the server is pretty much irrelevant as long as (a) it
doesn't change, and (b) all the data is in memory on the server anyway.

The way the client works is like this:

+---------+
|         |
|   NFS   |--+
|         |  |
+---------+  |   +----------+
             |   |          |
+---------+  +-->|          |
|         |      |          |
|   AFS   |----->| FS-Cache |
|         |      |          |--+
+---------+  +-->|          |  |
             |   |          |  |   +--------------+   +--------------+
+---------+  |   +----------+  |   |              |   |              |
|         |  |                 +-->|  CacheFiles  |-->|  Ext3        |
|  ISOFS  |--+                     |  /var/cache  |   |  /dev/sda6   |
|         |                        +--------------+   +--------------+
+---------+

 (1) NFS, say, asks FS-Cache to store/retrieve data for it;

 (2) FS-Cache asks the cache backend, in this case CacheFiles, to honour the
 operation;

 (3) CacheFiles 'opens' a file in a mounted filesystem, say Ext3, and does read
 and write operations of a sort on it;

 (4) Ext3 decides how the cache data is laid out on disk - CacheFiles just
 attempts to use one sparse file per netfs inode.
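
The layering in steps (1) to (4) can be sketched with two small classes.  This
is a toy model with hypothetical names: a dict-based backend plays the part of
CacheFiles, and the middle layer does nothing except route requests, which is
what lets several netfs's share one cache and lets the backend be swapped.

```python
class DictBackend:
    """Toy cache backend keeping one dict of pages per netfs object."""

    def __init__(self):
        self.objects = {}

    def read(self, key, page):
        return self.objects.get(key, {}).get(page)

    def write(self, key, page, data):
        self.objects.setdefault(key, {})[page] = data


class FSCache:
    """Thin middle layer: netfs clients talk to this, not to the backend."""

    def __init__(self, backend):
        self.backend = backend

    def retrieve(self, netfs, object_key, page):
        return self.backend.read((netfs, object_key), page)

    def store(self, netfs, object_key, page, data):
        self.backend.write((netfs, object_key), page, data)


cache = FSCache(DictBackend())
cache.store("nfs", "fh1", 0, b"data")
print(cache.retrieve("nfs", "fh1", 0))
print(cache.retrieve("afs", "fh1", 0))
```

Any object offering the same read and write methods could replace DictBackend
without the netfs side noticing.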

> I am trying to spot the numbers that show the sweet spot for this 
> optimization, without much success so far.

What are you trying to do exactly?  Are you actually playing with it, or just
looking at the numbers I've produced?

> Who is supposed to win big?  Is this mainly about reducing the load on 
> the server, or is the client supposed to win even with a lightly loaded 
> server?

These are difficult questions to answer.  The obvious answer to both is "it
depends", and the real answer to both is "it's a compromise".

Inserting a cache adds overhead: you have to look in the cache to see if your
objects are mirrored there, and then you have to look in the cache to see if
the data you want is stored there; and then you might have to go to the server
anyway and then schedule a copy to be stored in the cache.

The characteristics of this type of cache depend on a number of things: the
filesystem backing it being the most obvious variable, but also how fragmented
it is and the properties of the disk drive or drives it is on.

Whether it's worth having a cache depends on the characteristics of the network
versus the characteristics of the cache.  Latency of the cache vs latency of
the network, for example.  Network loading is another: having a cache on each
of several clients sharing a server can reduce network traffic by avoiding the
read requests to the server.  NFS has a characteristic that it keeps spamming
the server with file status requests, so even if you take the read requests out
of the load, an NFS client still generates quite a lot of network traffic to
the server - but the reduction is still useful.

The metadata problem is quite a tricky one since it increases with the number
of files you're dealing with.  As things stand in my patches, when NFS, for
example, wants to access a new inode, it first has to go to the server to
lookup the NFS file handle, and only then can it go to the cache to find out if
there's a matching object in the cache.  Worse, the cache must then perform
several synchronous disk bound metadata operations before it can be possible to
read from the cache.  Worse still, this means that a read on the network file
cannot proceed until (a) we've been to the server *plus* (b) we've been to the
disk.
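
The serialisation described above amounts to a simple latency sum.  The model
below is a sketch with made-up numbers, included only to show why a fast
server can beat a local disk; the function name and all figures are
hypothetical.

```python
def time_to_first_read(server_lookup_s, server_read_s,
                       cache_lookup_s=None, cache_read_s=None):
    """Seconds before a read on a newly accessed network file can complete.

    With no cache, the cost is the server lookup plus the server read.
    With a warm cache, the cache lookup cannot start until the server has
    returned the file handle, so the two lookups are paid one after the other.
    """
    if cache_lookup_s is None:
        return server_lookup_s + server_read_s
    return server_lookup_s + cache_lookup_s + cache_read_s


# Hypothetical figures: a LAN server with everything in memory, and a local
# disk that needs several seeks to find the cache object.
no_cache = time_to_first_read(0.0005, 0.001)
with_cache = time_to_first_read(0.0005, 0.001, cache_lookup_s=0.020, cache_read_s=0.008)
print(f"no cache:   {no_cache * 1000:.1f} ms")
print(f"with cache: {with_cache * 1000:.1f} ms")
```

The cache only wins when its lookup plus read is cheaper than the server read
it replaces, which a slow or heavily loaded link makes far more likely.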

The reason my client going to my server is so quick is that the server has the
dcache and the pagecache preloaded, so that across-network lookup operations
are really, really quick, as compared to the synchronous slogging of the local
disk to find the cache object.

I can probably improve this a little by pre-loading the subindex directories
(hash tables) that I use to reduce the directory size in the cache, but I don't
know by how much.

Anyway, to answer your questions:

 (1) It may help with heavily loaded networks with lots of read-only traffic.

 (2) It may help with slow connections (like doing NFS between the UK and
 Australia).

 (3) It could be used to do offline/disconnected operation.

David

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells

David Howells <[EMAIL PROTECTED]> wrote:

> > Have you got before/after benchmark results?
> 
> See attached.

Attached here are results using BTRFS (patched so that it'll work at all)
rather than Ext3 on the client on the partition backing the cache.

Note that I didn't bother redoing the tests that didn't involve a cache as the
choice of filesystem backing the cache should have no bearing on the result.

Generally, completely cold caches shouldn't show much variation as all the
writing can be done completely asynchronously, provided the client doesn't
fill its RAM.

The interesting case is where the disk cache is warm, but the pagecache is
cold (ie: just after a reboot after filling the caches).  Here, for the two
big files case, BTRFS appears quite a bit better than Ext3, showing a 21%
reduction in time for the smaller case and a 13% reduction for the larger
case.

For the many small/medium files case, BTRFS performed significantly better
(15% reduction in time) in the case where the caches were completely cold.
I'm not sure why, though - perhaps because it doesn't execute a write_begin()
stage during the write_one_page() call and thus doesn't go allocating disk
blocks to back the data, but instead allocates them later.

More surprising is that BTRFS performed significantly worse (15% increase in
time) in the case where the cache on disk was fully populated and then the
machine had been rebooted to clear the pagecaches.
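
The 15% figure can be checked from the tar timings for a warm on-disk cache
with cold pagecaches: 1m54.350s with Ext3 backing the cache and 2m12.527s with
BTRFS.  The helper name is hypothetical; the numbers are the ones reported in
this thread.

```python
def percent_change(baseline_s, measured_s):
    """Percentage change of measured_s relative to baseline_s."""
    return 100.0 * (measured_s - baseline_s) / baseline_s


ext3_warm_disk = 1 * 60 + 54.350     # 1m54.350s, Ext3-backed cache
btrfs_warm_disk = 2 * 60 + 12.527    # 2m12.527s, BTRFS-backed cache
change = percent_change(ext3_warm_disk, btrfs_warm_disk)
print(f"BTRFS vs Ext3, warm on-disk cache: {change:+.1f}%")
```

With only one run per test, a difference of this size should be read as
indicative rather than conclusive.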

It's important to note that I've only run each test once apiece, so the
numbers should be taken with a modicum of salt (bad statistics and all that).

David
---
===========================
FEW BIG FILES TEST ON BTRFS
===========================

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m2.124s
user    0m0.000s
sys     0m1.260s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m4.538s
user    0m0.000s
sys     0m2.624s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m0.061s
user    0m0.000s
sys     0m0.064s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m0.118s
user    0m0.000s
sys     0m0.116s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m0.189s
user    0m0.000s
sys     0m0.188s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m0.369s
user    0m0.000s
sys     0m0.368s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m1.540s
user    0m0.000s
sys     0m1.440s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m3.132s
user    0m0.000s
sys     0m1.724s


============================================
MANY SMALL/MEDIUM FILE READING TEST ON BTRFS
============================================

Completely cold caches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    0m31.838s
user    0m0.192s
sys     0m6.076s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    0m14.841s
user    0m0.148s
sys     0m4.988s

Warm BTRFS pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    0m16.773s
user    0m0.148s
sys     0m5.512s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time tar cf - /warthog/aaa >/dev/zero
real    2m12.527s
user    0m0.080s
sys     0m2.908s


Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells

Daniel Phillips <[EMAIL PROTECTED]> wrote:

> Have you got before/after benchmark results?

See attached.

These show a couple of things:

 (1) Dealing with lots of metadata slows things down a lot.  Note the result of
 looking up and reading lots of small files with tar (the last result).  The
 NFS client has to both consult the NFS server *and* the cache.  Not only
 that, but any asynchronicity the cache may like to do is rendered
 ineffective by the fact tar wants to do a read on a file pretty much
 directly after opening it.

 (2) Getting metadata from the local disk fs is slower than pulling it across
 an unshared gigabit ethernet from a server that already has it in memory.

These points don't mean that fscache is no use, just that you have to consider
carefully whether it's of use to *you* given your particular situation, and
that depends on various factors.

Note that currently FS-Caching is disabled for individual NFS files opened for
writing as there's no way to handle the coherency problems thereby introduced.

David
---

  ===
  FS-CACHE FOR NFS BENCHMARKS
  ===

 (*) The NFS client has a 1.86GHz Core2 Duo CPU and 1GB of RAM.

 (*) The NFS client has a Seagate ST380211AS 80GB 7200rpm SATA disk on an
 interface running in AHCI mode.  The chipset is an Intel G965.

 (*) A partition of approx 4.5GB is committed to caching, and is formatted as
 Ext3 with a blocksize of 4096 and directory indices.

 (*) The NFS client is using SELinux.

 (*) The NFS server is running an in-kernel NFSd, and has a 2.66GHz Core2 Duo
 CPU and 6GB of RAM.  The chipset is an Intel P965.

 (*) The NFS client is connected to the NFS server by Gigabit Ethernet.

 (*) The NFS mount is made with defaults for all options not relating to the
 cache:

warthog:/warthog /warthog nfs
rw,vers=3,rsize=1048576,wsize=1048576,hard,proto=tcp,timeo=600,
retrans=2,sec=sys,fsc,addr=w.x.y.z 0 0


==
FEW BIG FILES TEST
==

Where:

 (*) The NFS server has two files:

[EMAIL PROTECTED] ~]# ls -l /warthog/bigfile
-rw-rw-r-- 1 4043 4043 104857600 2006-11-30 09:39 /warthog/bigfile
[EMAIL PROTECTED] ~]# ls -l /warthog/biggerfile 
-rw-rw-r-- 1 4043 4041 209715200 2006-03-21 13:56 /warthog/biggerfile

 Both of which are in memory on the server in all cases.


No patches, cold NFS cache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m1.909s
user    0m0.000s
sys     0m0.520s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m3.750s
user    0m0.000s
sys     0m0.904s

CONFIG_FSCACHE=n, cold NFS cache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m2.003s
user    0m0.000s
sys     0m0.124s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m4.100s
user    0m0.004s
sys     0m0.488s

Cold NFS cache, no disk cache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m2.084s
user    0m0.000s
sys     0m0.136s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m4.020s
user    0m0.000s
sys     0m0.720s

Completely cold caches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m2.412s
user    0m0.000s
sys     0m0.892s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m4.449s
user    0m0.000s
sys     0m2.300s

Warm NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m0.067s
user    0m0.000s
sys     0m0.064s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m0.133s
user    0m0.000s
sys     0m0.136s

Warm Ext3 pagecache, cold NFS pagecache:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m0.173s
user    0m0.000s
sys     0m0.172s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m0.316s
user    0m0.000s
sys     0m0.316s

Warm on-disk cache, cold pagecaches:

[EMAIL PROTECTED] ~]# time cat /warthog/bigfile >/dev/null
real    0m1.955s
user    0m0.000s
sys     0m0.244s
[EMAIL PROTECTED] ~]# time cat /warthog/biggerfile >/dev/null
real    0m3.596s
user    0m0.000s
sys     0m0.460s


===
MANY SMALL/MEDIUM FILE READING TEST
===

Where:

 (*) The NFS server has an old kernel tree:

[EMAIL PROTECTED] ~]# du -s /warthog/aaa
347340  /warthog/aaa

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-21 Thread David Howells

Daniel Phillips <[EMAIL PROTECTED]> wrote:

> > These patches add local caching for network filesystems such as NFS.
> 
> Have you got before/after benchmark results?

I need to get a new hard drive for my test machine before I can go and get
some more up to date benchmark results.  It does seem, however, that the I/O
error handling capabilities of FS-Cache work properly:-)

David

Re: [PATCH 00/37] Permit filesystem local caching

2008-02-20 Thread David Howells

Serge E. Hallyn <[EMAIL PROTECTED]> wrote:

> Seems *really* weird that every time you send this, patch 6 doesn't seem
> to reach me in any of my mailboxes...  (did get it from the url
> you listed)

It's the largest of the patches, so that's not entirely surprising.  Hence why
I included the URL to the tarball also.

> I'm sorry if I miss where you explicitly state this, but is it safe to
> assume, as perusing the patches suggests, that
> 
>   1. tsk->sec never changes other than in task_alloc_security()?  

Correct.

>   2. tsk->act_as is only ever dereferenced from (a) current->

That ought to be correct.

>  except (b) in do_coredump?

Actually, do_coredump() only deals with current->act_as.

> (thereby carefully avoiding locking issues)

That's the idea.

> I'd still like to see some performance numbers.  Not to object to
> these patches, just to make sure there's no need to try and optimize
> more of the dereferences away when they're not needed.

I hope that the performance impact is minimal.  The kernel should spend very
little time looking at the security data.  I'll try and get some though.

> Oh, manually copied from patch 6, I see you have in the task_security
> struct definition:
> 
>   kernel_cap_t cap_bset;   /* ? */
> 
> That comment can be filled in with 'capability bounding set' (for this
> task and all its future descendents).

Thanks.

David

[PATCH 37/37] NFS: Add mount options to enable local caching on NFS

2008-02-20 Thread David Howells

Add NFS mount options to allow the local caching support to be enabled.

The attached patch makes it possible for the NFS filesystem to be told to make
use of the network filesystem local caching service (FS-Cache).

To be able to use this, a recent nfs-utils package is required.

There are three variant NFS mount options that can be added to a mount command
to control caching for a mount.  Only the last one specified takes effect:

 (*) Adding "fsc" will request caching.

 (*) Adding "fsc=<string>" will request caching and also specify a uniquifier.

 (*) Adding "nofsc" will disable caching.

For example:

mount warthog:/ /a -o fsc
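
The "only the last one specified takes effect" rule can be illustrated with a
minimal user-space sketch.  This is not the kernel parser: struct fsc_opts,
fsc_apply() and fsc_parse() are hypothetical names, and having "nofsc" also
discard the uniquifier is an assumption, since the corresponding hunk of the
patch is cut off in this archive copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the parsed mount data. */
struct fsc_opts {
	int enabled;	/* non-zero once caching has been requested */
	char *uniq;	/* uniquifier from "fsc=<string>", or NULL */
};

/* Drop any uniquifier recorded by an earlier option. */
static void fsc_clear_uniq(struct fsc_opts *o)
{
	free(o->uniq);
	o->uniq = NULL;
}

/* Apply one option token; a later token overrides an earlier one. */
static void fsc_apply(struct fsc_opts *o, const char *tok)
{
	assert(o && tok);

	if (strcmp(tok, "fsc") == 0) {
		o->enabled = 1;
		fsc_clear_uniq(o);
	} else if (strncmp(tok, "fsc=", 4) == 0) {
		char *copy = strdup(tok + 4);

		if (!copy)
			return;		/* out of memory: leave state alone */
		fsc_clear_uniq(o);
		o->uniq = copy;
		o->enabled = 1;
	} else if (strcmp(tok, "nofsc") == 0) {
		o->enabled = 0;
		fsc_clear_uniq(o);	/* assumption, see lead-in */
	}
	/* Any other token is not a caching option and is ignored here. */
}

/* Walk a comma-separated option string such as "rw,fsc=a,nofsc". */
static void fsc_parse(struct fsc_opts *o, const char *options)
{
	char *buf = strdup(options);
	char *save = NULL;
	char *tok;

	if (!buf)
		return;
	for (tok = strtok_r(buf, ",", &save); tok;
	     tok = strtok_r(NULL, ",", &save))
		fsc_apply(o, tok);
	free(buf);
}
```

With this sketch, parsing "fsc=alpha,nofsc" leaves caching disabled, whereas
"nofsc,fsc=beta" leaves it enabled with the uniquifier "beta".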


The cache of a particular superblock (NFS FSID) will be shared between all
mounts of that volume, provided they have the same connection parameters and
are not marked 'nosharecache'.

Where it is otherwise impossible to distinguish superblocks because all the
parameters are identical, but the 'nosharecache' option is supplied, a
uniquifying string must be supplied, else only the first mount will be
permitted to use the cache.

If there's a key collision, then the second mount will disable caching and emit
a warning in the kernel log.
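
The collision rule for unshared mounts can be pictured roughly as follows.
This is only an illustrative sketch: struct key_registry, claim_cache_key()
and the key strings are hypothetical and do not reflect how the patch actually
builds or stores its keys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_KEYS 16

/* Hypothetical registry of cache keys already claimed by mounts. */
struct key_registry {
	const char *keys[SKETCH_MAX_KEYS];	/* borrowed from the caller */
	size_t nr;
};

/*
 * Try to claim @key for a mount.  Returns 1 if the mount may use the cache,
 * or 0 if the key is already taken (or the table is full), in which case the
 * mount would carry on uncached.
 */
static int claim_cache_key(struct key_registry *r, const char *key)
{
	size_t i;

	assert(r && key);

	for (i = 0; i < r->nr; i++)
		if (strcmp(r->keys[i], key) == 0)
			return 0;	/* collision: second mount loses */

	if (r->nr == SKETCH_MAX_KEYS)
		return 0;

	r->keys[r->nr++] = key;
	return 1;
}
```

Two unshared mounts with identical parameters would present the same key, so
only the first claim succeeds; giving the second mount a distinct uniquifier
changes its key and lets both use the cache.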


Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/nfs/client.c   |    2 ++
 fs/nfs/internal.h |    1 +
 fs/nfs/super.c    |   25 +++++++++++++++++++++++++
 3 files changed, 28 insertions(+), 0 deletions(-)


diff --git a/fs/nfs/client.c b/fs/nfs/client.c
index d67d52f..8357f68 100644
--- a/fs/nfs/client.c
+++ b/fs/nfs/client.c
@@ -669,6 +669,7 @@ static int nfs_init_server(struct nfs_server *server,
 
/* Initialise the client representation from the mount data */
server->flags = data->flags & NFS_MOUNT_FLAGMASK;
+   server->options = data->options;
 
if (data->rsize)
server->rsize = nfs_block_size(data->rsize, NULL);
@@ -1056,6 +1057,7 @@ static int nfs4_init_server(struct nfs_server *server,
/* Initialise the client representation from the mount data */
server->flags = data->flags & NFS_MOUNT_FLAGMASK;
server->caps |= NFS_CAP_ATOMIC_OPEN;
+   server->options = data->options;
 
if (data->rsize)
server->rsize = nfs_block_size(data->rsize, NULL);
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index e49cb6e..f427b35 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -38,6 +38,7 @@ struct nfs_parsed_mount_data {
int acregmin, acregmax,
acdirmin, acdirmax;
int namlen;
+   unsigned int options;
unsigned int bsize;
unsigned int auth_flavor_len;
rpc_authflavor_t auth_flavors[1];
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 79c4abe..4c513c6 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -76,6 +76,7 @@ enum {
Opt_acl, Opt_noacl,
Opt_rdirplus, Opt_nordirplus,
Opt_sharecache, Opt_nosharecache,
+   Opt_fscache, Opt_nofscache,
 
/* Mount options that take integer arguments */
Opt_port,
@@ -92,6 +93,7 @@ enum {
/* Mount options that take string arguments */
Opt_sec, Opt_proto, Opt_mountproto, Opt_mounthost,
Opt_addr, Opt_mountaddr, Opt_clientaddr,
+   Opt_fscache_uniq,
 
/* Mount options that are ignored */
Opt_userspace, Opt_deprecated,
@@ -125,6 +127,9 @@ static match_table_t nfs_mount_option_tokens = {
{ Opt_nordirplus, "nordirplus" },
{ Opt_sharecache, "sharecache" },
{ Opt_nosharecache, "nosharecache" },
+   { Opt_fscache, "fsc" },
+   { Opt_fscache_uniq, "fsc=%s" },
+   { Opt_nofscache, "nofsc" },
 
{ Opt_port, "port=%u" },
{ Opt_rsize, "rsize=%u" },
@@ -486,6 +491,8 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss,
seq_printf(m, ",timeo=%lu", 10U * nfss->client->cl_timeout->to_initval / HZ);
seq_printf(m, ",retrans=%u", nfss->client->cl_timeout->to_retries);
seq_printf(m, ",sec=%s", nfs_pseudoflavour_to_name(nfss->client->cl_auth->au_flavor));
+   if (nfss->options & NFS_OPTION_FSCACHE)
+       seq_printf(m, ",fsc");
 }
 
 /*
@@ -780,6 +787,24 @@ static int nfs_parse_mount_options(char *raw,
case Opt_nosharecache:
mnt->flags |= NFS_MOUNT_UNSHARED;
break;
+   case Opt_fscache:
+       mnt->options |= NFS_OPTION_FSCACHE;
+       kfree(mnt->fscache_uniq);
+       mnt->fscache_uniq = NULL;
+       break;
+   case Opt_nofscache:
+

[PATCH 32/37] NFS: Add read context retention for FS-Cache to call back with

2008-02-20 Thread David Howells

Add read context retention so that FS-Cache can call back into NFS when a read
operation on the cache fails with EIO rather than returning data.  This permits
NFS to then fetch the data from the server instead, using the appropriate
security context.
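
The get/put pairing this relies on is ordinary reference counting.  A minimal
user-space sketch, assuming a hypothetical struct read_ctx in place of the NFS
open context and a flag in place of actually freeing it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a read context. */
struct read_ctx {
	unsigned int refs;	/* number of current holders */
	int released;		/* set when the last reference is dropped */
};

/* Take an extra reference, e.g. while a cache read is in flight. */
static void ctx_get(struct read_ctx *ctx)
{
	assert(ctx && ctx->refs > 0);
	ctx->refs++;
}

/* Drop a reference; tolerates NULL, as the put hook in the patch does. */
static void ctx_put(struct read_ctx *ctx)
{
	if (!ctx)
		return;
	assert(ctx->refs > 0);
	if (--ctx->refs == 0)
		ctx->released = 1;	/* real code would free the object */
}
```

The point of the extra reference is that the context stays valid until the
cache has either delivered the data or reported failure, even if the original
reader has already dropped its own reference.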

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/nfs/fscache-index.c |   26 ++++++++++++++++++++++++++
 1 files changed, 26 insertions(+), 0 deletions(-)


diff --git a/fs/nfs/fscache-index.c b/fs/nfs/fscache-index.c
index eec8e7e..af9f06b 100644
--- a/fs/nfs/fscache-index.c
+++ b/fs/nfs/fscache-index.c
@@ -285,6 +285,30 @@ static void nfs_cache_inode_now_uncached(void *cookie_netfs_data)
 }
 
 /*
+ * Get an extra reference on a read context.
+ * - This function can be absent if the completion function doesn't require a
+ *   context.
+ * - The read context is passed back to NFS in the event that a data read on the
+ *   cache fails with EIO - in which case the server must be contacted to
+ *   retrieve the data, which requires the read context for security.
+ */
+static void nfs_fh_get_context(void *cookie_netfs_data, void *context)
+{
+   get_nfs_open_context(context);
+}
+
+/*
+ * Release an extra reference on a read context.
+ * - This function can be absent if the completion function doesn't require a
+ *   context.
+ */
+static void nfs_fh_put_context(void *cookie_netfs_data, void *context)
+{
+   if (context)
+       put_nfs_open_context(context);
+}
+
+/*
  * Define the inode object for FS-Cache.  This is used to describe an inode
  * object to fscache_acquire_cookie().  It is keyed by the NFS file handle for
  * an inode.
@@ -301,4 +325,6 @@ const struct fscache_cookie_def nfs_cache_inode_object_def = {
.get_aux        = nfs_cache_inode_get_aux,
.check_aux      = nfs_cache_inode_check_aux,
.now_uncached   = nfs_cache_inode_now_uncached,
+   .get_context    = nfs_fh_get_context,
+   .put_context    = nfs_fh_put_context,
 };


[PATCH 30/37] NFS: Add some new I/O event counters for FS-Cache events

2008-02-20 Thread David Howells

Add some new NFS I/O event counters for FS-Cache events.  They have to be
added as byte counters because I may need to be able to increase the numbers
by more than 1 at a time.
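
As a rough illustration of why an "add N" style counter is wanted rather than
a plain increment, here is a minimal user-space sketch.  The enum values,
struct cache_stats and cache_add_stats() are hypothetical names that only
mirror the shape of the counters added below.

```c
#include <assert.h>

/* Hypothetical event indices, one slot per kind of cache event. */
enum cache_stat {
	CACHE_READ_OK,
	CACHE_READ_FAIL,
	CACHE_WRITE_OK,
	CACHE_WRITE_FAIL,
	CACHE_UNCACHE,
	CACHE_STAT_MAX
};

struct cache_stats {
	unsigned long long count[CACHE_STAT_MAX];
};

/* Add @n events at once, e.g. after a multi-page read completes. */
static void cache_add_stats(struct cache_stats *s, enum cache_stat which,
			    unsigned long long n)
{
	assert(s && which < CACHE_STAT_MAX);
	s->count[which] += n;
}
```

A readpages-style operation that succeeds for 16 pages can then be recorded
with a single cache_add_stats(&stats, CACHE_READ_OK, 16) call.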

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/nfs/iostat.h |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)


diff --git a/fs/nfs/iostat.h b/fs/nfs/iostat.h
index 6350ecb..0e3b170 100644
--- a/fs/nfs/iostat.h
+++ b/fs/nfs/iostat.h
@@ -60,6 +60,13 @@ enum nfs_stat_bytecounters {
NFSIOS_SERVERWRITTENBYTES,
NFSIOS_READPAGES,
NFSIOS_WRITEPAGES,
+#ifdef CONFIG_NFS_FSCACHE
+   NFSIOS_FSCACHE_READ_OK,
+   NFSIOS_FSCACHE_READ_FAIL,
+   NFSIOS_FSCACHE_WRITE_OK,
+   NFSIOS_FSCACHE_WRITE_FAIL,
+   NFSIOS_FSCACHE_UNCACHE,
+#endif
__NFSIOS_BYTESMAX,
 };
 


[PATCH 35/37] NFS: Store pages from an NFS inode into a local cache

2008-02-20 Thread David Howells

Store pages from an NFS inode into the cache data storage object associated
with that inode.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/nfs/fscache.c |   26 ++++++++++++++++++++++++++
 fs/nfs/fscache.h |   16 ++++++++++++++++
 fs/nfs/read.c    |    5 +++++
 3 files changed, 47 insertions(+), 0 deletions(-)


diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c
index 438cc9b..50ae70f 100644
--- a/fs/nfs/fscache.c
+++ b/fs/nfs/fscache.c
@@ -456,3 +456,29 @@ int __nfs_readpages_from_fscache(struct nfs_open_context *ctx,
 
return ret;
 }
+
+/*
+ * Store a newly fetched page in fscache
+ * - PG_fscache must be set on the page
+ */
+void __nfs_readpage_to_fscache(struct inode *inode, struct page *page, int sync)
+{
+   int ret;
+
+   dfprintk(FSCACHE,
+"NFS: readpage_to_fscache(fsc:%p/p:%p(i:%lx f:%lx)/%d)\n",
+NFS_I(inode)->fscache, page, page->index, page->flags, sync);
+
+   ret = fscache_write_page(NFS_I(inode)->fscache, page, GFP_KERNEL);
+   dfprintk(FSCACHE,
+"NFS: readpage_to_fscache: p:%p(i:%lu f:%lx) ret %d\n",
+page, page->index, page->flags, ret);
+
+   if (ret != 0) {
+       fscache_uncache_page(NFS_I(inode)->fscache, page);
+       nfs_add_stats(inode, NFSIOS_FSCACHE_WRITE_FAIL, 1);
+       nfs_add_stats(inode, NFSIOS_FSCACHE_UNCACHE, 1);
+   } else {
+       nfs_add_stats(inode, NFSIOS_FSCACHE_WRITE_OK, 1);
+   }
+}
diff --git a/fs/nfs/fscache.h b/fs/nfs/fscache.h
index 4c1e1a8..6264cd8 100644
--- a/fs/nfs/fscache.h
+++ b/fs/nfs/fscache.h
@@ -94,6 +94,7 @@ extern int __nfs_readpage_from_fscache(struct nfs_open_context *,
 extern int __nfs_readpages_from_fscache(struct nfs_open_context *,
struct inode *, struct address_space *,
struct list_head *, unsigned *);
+extern void __nfs_readpage_to_fscache(struct inode *, struct page *, int);
 
 /*
  * release the caching state associated with a page if undergoing complete page
@@ -133,6 +134,19 @@ static inline int nfs_readpages_from_fscache(struct nfs_open_context *ctx,
return -ENOBUFS;
 }
 
+/*
+ * Store a page newly fetched from the server in an inode data storage object
+ * in the cache.
+ */
+static inline void nfs_readpage_to_fscache(struct inode *inode,
+  struct page *page,
+  int sync)
+{
+   if (PageFsCache(page))
+       __nfs_readpage_to_fscache(inode, page, sync);
+}
+
+
 #else /* CONFIG_NFS_FSCACHE */
 static inline int nfs_fscache_register(void) { return 0; }
 static inline void nfs_fscache_unregister(void) {}
@@ -178,6 +192,8 @@ static inline int nfs_readpages_from_fscache(struct nfs_open_context *ctx,
 {
return -ENOBUFS;
 }
+static inline void nfs_readpage_to_fscache(struct inode *inode,
+  struct page *page, int sync) {}
 
 #endif /* CONFIG_NFS_FSCACHE */
 #endif /* _NFS_FSCACHE_H */
diff --git a/fs/nfs/read.c b/fs/nfs/read.c
index db27b26..e09bdf9 100644
--- a/fs/nfs/read.c
+++ b/fs/nfs/read.c
@@ -143,6 +143,11 @@ int nfs_readpage_async(struct nfs_open_context *ctx, struct inode *inode,
 
 static void nfs_readpage_release(struct nfs_page *req)
 {
+   struct inode *d_inode = req->wb_context->path.dentry->d_inode;
+
+   if (PageUptodate(req->wb_page))
+       nfs_readpage_to_fscache(d_inode, req->wb_page, 0);
+
unlock_page(req->wb_page);
 
dprintk("NFS: read done (%s/%Ld [EMAIL PROTECTED])\n",


[PATCH 17/37] CacheFiles: Add a hook to write a single page of data to an inode

2008-02-20 Thread David Howells

Add an address space operation to write one single page of data to an inode at
a page-aligned location (thus permitting the implementation to be highly
optimised).  The data source is a single page.

This is used by CacheFiles to store the contents of netfs pages into their
backing file pages.

Supply a generic implementation for this that uses the write_begin() and
write_end() address_space operations to bind a copy directly into the page
cache.
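
For readers who want a feel for the semantics, here is a user-space analogue
only, not the kernel implementation: it uses pwrite() on a file descriptor
where the patch uses write_begin()/write_end() on an address space.  The names
sketch_write_one_page(), sketch_demo() and SKETCH_PAGE_SIZE are hypothetical.
It follows the kernel-doc in the patch in not extending the file when the page
crosses EOF; treating a page that lies wholly beyond EOF as a no-op is an
assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for the sketch */

/*
 * Copy one page from @source over page @index of the file behind @fd.
 * Returns 0 on success or a negative errno value.
 */
static int sketch_write_one_page(int fd, off_t index,
				 const unsigned char *source)
{
	struct stat st;
	off_t pos = index * SKETCH_PAGE_SIZE;
	size_t len = SKETCH_PAGE_SIZE;
	size_t done = 0;

	assert(source && index >= 0);

	if (fstat(fd, &st) < 0)
		return -errno;
	if (pos >= st.st_size)
		return 0;	/* assumption: nothing to do beyond EOF */
	if (st.st_size - pos < (off_t)len)
		len = (size_t)(st.st_size - pos); /* clip at EOF, no extend */

	while (done < len) {
		ssize_t n = pwrite(fd, source + done, len - done,
				   pos + (off_t)done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		if (n == 0)
			return -EIO;	/* no progress: give up */
		done += (size_t)n;
	}
	return 0;
}

/*
 * Usage demo: write page 1 of a 5000-byte temporary file and check that the
 * file is still 5000 bytes long afterwards.  Returns 0 if so, -1 otherwise.
 */
static int sketch_demo(void)
{
	char path[] = "/tmp/write-one-page-XXXXXX";
	unsigned char page[SKETCH_PAGE_SIZE];
	unsigned char byte = 0;
	struct stat st;
	int fd = mkstemp(path);
	int ret = -1;

	if (fd < 0)
		return -1;
	unlink(path);
	memset(page, 0xAB, sizeof(page));

	if (ftruncate(fd, 5000) == 0 &&
	    sketch_write_one_page(fd, 1, page) == 0 &&
	    fstat(fd, &st) == 0 && st.st_size == 5000 &&
	    pread(fd, &byte, 1, 4999) == 1 && byte == 0xAB)
		ret = 0;

	close(fd);
	return ret;
}
```

Keeping the write to exactly one page at a page-aligned offset is what lets an
implementation skip the general iovec handling of the normal write path.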

Hook the Ext2 and Ext3 operations to the generic implementation.

Signed-off-by: David Howells <[EMAIL PROTECTED]>
---

 fs/ext2/inode.c|2 ++
 fs/ext3/inode.c|3 +++
 include/linux/fs.h |7 ++
 mm/filemap.c   |   61 
 4 files changed, 73 insertions(+), 0 deletions(-)


diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index c620068..f483014 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -792,6 +792,7 @@ const struct address_space_operations ext2_aops = {
.direct_IO  = ext2_direct_IO,
.writepages = ext2_writepages,
.migratepage= buffer_migrate_page,
+   .write_one_page = generic_file_buffered_write_one_page,
 };
 
 const struct address_space_operations ext2_aops_xip = {
@@ -810,6 +811,7 @@ const struct address_space_operations ext2_nobh_aops = {
.direct_IO  = ext2_direct_IO,
.writepages = ext2_writepages,
.migratepage= buffer_migrate_page,
+   .write_one_page = generic_file_buffered_write_one_page,
 };
 
 /*
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index c976123..0209f3b 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1776,6 +1776,7 @@ static const struct address_space_operations ext3_ordered_aops = {
.releasepage= ext3_releasepage,
.direct_IO  = ext3_direct_IO,
.migratepage= buffer_migrate_page,
+   .write_one_page = generic_file_buffered_write_one_page,
 };
 
 static const struct address_space_operations ext3_writeback_aops = {
@@ -1790,6 +1791,7 @@ static const struct address_space_operations ext3_writeback_aops = {
.releasepage= ext3_releasepage,
.direct_IO  = ext3_direct_IO,
.migratepage= buffer_migrate_page,
+   .write_one_page = generic_file_buffered_write_one_page,
 };
 
 static const struct address_space_operations ext3_journalled_aops = {
@@ -1803,6 +1805,7 @@ static const struct address_space_operations ext3_journalled_aops = {
.bmap   = ext3_bmap,
.invalidatepage = ext3_invalidatepage,
.releasepage= ext3_releasepage,
+   .write_one_page = generic_file_buffered_write_one_page,
 };
 
 void ext3_set_aops(struct inode *inode)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d218ef5..dd6c3d1 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -481,6 +481,11 @@ struct address_space_operations {
int (*migratepage) (struct address_space *,
struct page *, struct page *);
int (*launder_page) (struct page *);
+   /* write the contents of the source page over the page at the specified
+* index in the target address space (the source page does not need to
+* be related to the target address space) */
+   int (*write_one_page)(struct address_space *, pgoff_t, struct page *);
+
 };
 
 /*
@@ -1811,6 +1816,8 @@ extern ssize_t generic_file_direct_write(struct kiocb *, const struct iovec *,
unsigned long *, loff_t, loff_t *, size_t, size_t);
 extern ssize_t generic_file_buffered_write(struct kiocb *, const struct iovec *,
unsigned long, loff_t, loff_t *, size_t, ssize_t);
+extern int generic_file_buffered_write_one_page(struct address_space *,
+   pgoff_t, struct page *);
 extern ssize_t do_sync_read(struct file *filp, char __user *buf, size_t len, loff_t *ppos);
 extern ssize_t do_sync_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos);
 extern int generic_segment_checks(const struct iovec *iov,
diff --git a/mm/filemap.c b/mm/filemap.c
index df1e149..a583f44 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2359,6 +2359,67 @@ generic_file_buffered_write(struct kiocb *iocb, const struct iovec *iov,
 }
 EXPORT_SYMBOL(generic_file_buffered_write);
 
+/**
+ * generic_file_buffered_write_one_page - Write a single page of data to an
+ * inode
+ * @mapping - The address space of the target inode
+ * @index - The target page in the target inode to fill
+ * @source - The data to write into the target page
+ *
+ * Write the data from the source page to the page in the nominated address
+ * space at the @index specified.  Note that the file will not be extended if
+ * the page crosses the EOF marker, in which case only the first part of the
+ * page will be written.
+ *
+ * The @source page does not need to have any association wi
