[RFC PATCH 5/7] crypto: n2 - remove ecb(arc4) support

2020-07-02 Thread Ard Biesheuvel
Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/n2_core.c | 46 
 1 file changed, 46 deletions(-)

diff --git a/drivers/crypto/n2_core.c b/drivers/crypto/n2_core.c
index 6a828bbecea4..c347e58cd9a1 100644
--- a/drivers/crypto/n2_core.c
+++ b/drivers/crypto/n2_core.c
@@ -662,7 +662,6 @@ struct n2_skcipher_context {
u8  aes[AES_MAX_KEY_SIZE];
u8  des[DES_KEY_SIZE];
u8  des3[3 * DES_KEY_SIZE];
-   u8  arc4[258]; /* S-box, X, Y */
} key;
 };
 
@@ -789,36 +788,6 @@ static int n2_3des_setkey(struct crypto_skcipher *skcipher, const u8 *key,
return 0;
 }
 
-static int n2_arc4_setkey(struct crypto_skcipher *skcipher, const u8 *key,
- unsigned int keylen)
-{
-   struct crypto_tfm *tfm = crypto_skcipher_tfm(skcipher);
-   struct n2_skcipher_context *ctx = crypto_tfm_ctx(tfm);
-   struct n2_skcipher_alg *n2alg = n2_skcipher_alg(skcipher);
-   u8 *s = ctx->key.arc4;
-   u8 *x = s + 256;
-   u8 *y = x + 1;
-   int i, j, k;
-
-   ctx->enc_type = n2alg->enc_type;
-
-   j = k = 0;
-   *x = 0;
-   *y = 0;
-   for (i = 0; i < 256; i++)
-   s[i] = i;
-   for (i = 0; i < 256; i++) {
-   u8 a = s[i];
-   j = (j + key[k] + a) & 0xff;
-   s[i] = s[j];
-   s[j] = a;
-   if (++k >= keylen)
-   k = 0;
-   }
-
-   return 0;
-}
-
 static inline int skcipher_descriptor_len(int nbytes, unsigned int block_size)
 {
int this_len = nbytes;
@@ -1122,21 +1091,6 @@ struct n2_skcipher_tmpl {
 };
 
 static const struct n2_skcipher_tmpl skcipher_tmpls[] = {
-   /* ARC4: only ECB is supported (chaining bits ignored) */
-   {   .name   = "ecb(arc4)",
-   .drv_name   = "ecb-arc4",
-   .block_size = 1,
-   .enc_type   = (ENC_TYPE_ALG_RC4_STREAM |
-  ENC_TYPE_CHAINING_ECB),
-   .skcipher   = {
-   .min_keysize= 1,
-   .max_keysize= 256,
-   .setkey = n2_arc4_setkey,
-   .encrypt= n2_encrypt_ecb,
-   .decrypt= n2_decrypt_ecb,
-   },
-   },
-
/* DES: ECB CBC and CFB are supported */
{   .name   = "ecb(des)",
.drv_name   = "ecb-des",
-- 
2.17.1



[RFC PATCH 3/7] SUNRPC: remove RC4-HMAC-MD5 support from KerberosV

2020-07-02 Thread Ard Biesheuvel
The RC4-HMAC-MD5 KerberosV algorithm is based on RFC 4757 [0], which
was specifically issued for interoperability with Windows 2000, but was
never intended to receive the same level of support. The RFC says

  The IETF Kerberos community supports publishing this specification as
  an informational document in order to describe this widely
  implemented technology.  However, while these encryption types
  provide the operations necessary to implement the base Kerberos
  specification [RFC4120], they do not provide all the required
  operations in the Kerberos cryptography framework [RFC3961].  As a
  result, it is not generally possible to implement potential
  extensions to Kerberos using these encryption types.  The Kerberos
  encryption type negotiation mechanism [RFC4537] provides one approach
  for using such extensions even when a Kerberos infrastructure uses
  long-term RC4 keys.  Because this specification does not implement
  operations required by RFC 3961 and because of security concerns with
  the use of RC4 and MD4 discussed in Section 8, this specification is
  not appropriate for publication on the standards track.

  The RC4-HMAC encryption types are used to ease upgrade of existing
  Windows NT environments, provide strong cryptography (128-bit key
  lengths), and provide exportable (meet United States government
  export restriction requirements) encryption.  This document describes
  the implementation of those encryption types.

Furthermore, this RFC was re-classified as 'historic' by RFC 8429 [1] in
2018, which states that 'none of the encryption types it specifies should
be used'.

Note that other outdated algorithms are left in place (some of which are
guarded by CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES), so this should only
adversely affect interoperability with Windows NT/2000 systems that have
not received any updates since 2008 (but are nonetheless connected to a
network).

[0] https://tools.ietf.org/html/rfc4757
[1] https://tools.ietf.org/html/rfc8429

Signed-off-by: Ard Biesheuvel 
---
 include/linux/sunrpc/gss_krb5.h  |  11 -
 include/linux/sunrpc/gss_krb5_enctypes.h |   9 +-
 net/sunrpc/Kconfig   |   1 -
 net/sunrpc/auth_gss/gss_krb5_crypto.c| 276 
 net/sunrpc/auth_gss/gss_krb5_mech.c  |  95 ---
 net/sunrpc/auth_gss/gss_krb5_seal.c  |   1 -
 net/sunrpc/auth_gss/gss_krb5_seqnum.c|  87 --
 net/sunrpc/auth_gss/gss_krb5_unseal.c|   1 -
 net/sunrpc/auth_gss/gss_krb5_wrap.c  |  65 +
 9 files changed, 16 insertions(+), 530 deletions(-)

diff --git a/include/linux/sunrpc/gss_krb5.h b/include/linux/sunrpc/gss_krb5.h
index e8f8ffe7448b..91f43d86879d 100644
--- a/include/linux/sunrpc/gss_krb5.h
+++ b/include/linux/sunrpc/gss_krb5.h
@@ -141,14 +141,12 @@ enum sgn_alg {
SGN_ALG_MD2_5 = 0x0001,
SGN_ALG_DES_MAC = 0x0002,
SGN_ALG_3 = 0x0003, /* not published */
-   SGN_ALG_HMAC_MD5 = 0x0011,  /* microsoft w2k; no support */
SGN_ALG_HMAC_SHA1_DES3_KD = 0x0004
 };
 enum seal_alg {
SEAL_ALG_NONE = 0x,
SEAL_ALG_DES = 0x,
SEAL_ALG_1 = 0x0001,/* not published */
-   SEAL_ALG_MICROSOFT_RC4 = 0x0010,/* microsoft w2k; no support */
SEAL_ALG_DES3KD = 0x0002
 };
 
@@ -316,14 +314,5 @@ gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, u32 len,
 struct xdr_buf *buf, u32 *plainoffset,
 u32 *plainlen);
 
-int
-krb5_rc4_setup_seq_key(struct krb5_ctx *kctx,
-  struct crypto_sync_skcipher *cipher,
-  unsigned char *cksum);
-
-int
-krb5_rc4_setup_enc_key(struct krb5_ctx *kctx,
-  struct crypto_sync_skcipher *cipher,
-  s32 seqnum);
 void
 gss_krb5_make_confounder(char *p, u32 conflen);
diff --git a/include/linux/sunrpc/gss_krb5_enctypes.h b/include/linux/sunrpc/gss_krb5_enctypes.h
index 981c89cef19d..87eea679d750 100644
--- a/include/linux/sunrpc/gss_krb5_enctypes.h
+++ b/include/linux/sunrpc/gss_krb5_enctypes.h
@@ -13,15 +13,13 @@
 #ifdef CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES
 
 /*
- * NB: This list includes encryption types that were deprecated
- * by RFC 8429 (DES3_CBC_SHA1 and ARCFOUR_HMAC).
+ * NB: This list includes DES3_CBC_SHA1, which was deprecated by RFC 8429.
  *
  * ENCTYPE_AES256_CTS_HMAC_SHA1_96
  * ENCTYPE_AES128_CTS_HMAC_SHA1_96
  * ENCTYPE_DES3_CBC_SHA1
- * ENCTYPE_ARCFOUR_HMAC
  */
-#define KRB5_SUPPORTED_ENCTYPES "18,17,16,23"
+#define KRB5_SUPPORTED_ENCTYPES "18,17,16"
 
 #else  /* CONFIG_SUNRPC_DISABLE_INSECURE_ENCTYPES */
 
@@ -32,12 +30,11 @@
  * ENCTYPE_AES256_CTS_HMAC_SHA1_96
  * ENCTYPE_AES128_CTS_HMAC_SHA1_96
  * ENCTYPE_DES3_CBC_SHA1
- * ENCTYPE_ARCFOUR_HMAC
  * ENCTYPE_DES_CBC_MD5
  * ENCTYPE_DES_CBC_CRC
  * ENCTYPE_DES_CBC_MD4
  */
-#define KRB5_SUPPORTED_ENCTYPES "18,17,16,23,3,1,2"
+#define 

Re: [PATCH] crypto: caam - Remove broken arc4 support

2020-07-02 Thread Ard Biesheuvel
On Thu, 2 Jul 2020 at 09:56, Herbert Xu  wrote:
>
> On Thu, Jul 02, 2020 at 09:51:29AM +0200, Ard Biesheuvel wrote:
> >
> > I'll wait for the code to be posted (please put me on cc), but my
>
> Sure I will.
>
> > suspicion is that carrying opaque state like that is going to bite us
> > down the road.
>
> Well it's only going to be arc4 at first, where it's definitely
> an improvement over modifying the tfm context in encrypt/decrypt.
>

I agree that the current approach is flawed, but starting multiple
requests with the same state essentially comes down to IV reuse in a
stream cipher, which fails catastrophically: both requests see the same
keystream, so ciphertext1 ^ ciphertext2 == plaintext1 ^ plaintext2, and
knowing either plaintext reveals the other.
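
To make that concrete, here is a stand-alone user-space demonstration
(plain C, purely illustrative, not part of any of these patches) of two
requests being started from the same keyed RC4 state:

#include <stdio.h>
#include <stdint.h>

struct rc4_state { uint8_t s[256]; uint8_t x, y; };

static void rc4_setup(struct rc4_state *st, const uint8_t *key, int keylen)
{
	int i, j = 0;

	for (i = 0; i < 256; i++)
		st->s[i] = i;
	for (i = 0; i < 256; i++) {
		uint8_t a = st->s[i];

		j = (j + key[i % keylen] + a) & 0xff;
		st->s[i] = st->s[j];
		st->s[j] = a;
	}
	st->x = st->y = 0;
}

static void rc4_crypt(struct rc4_state *st, const uint8_t *in, uint8_t *out, int len)
{
	int i;

	for (i = 0; i < len; i++) {
		uint8_t a, b;

		st->x = (st->x + 1) & 0xff;
		a = st->s[st->x];
		st->y = (st->y + a) & 0xff;
		b = st->s[st->y];
		st->s[st->x] = b;
		st->s[st->y] = a;
		out[i] = in[i] ^ st->s[(a + b) & 0xff];
	}
}

int main(void)
{
	const uint8_t key[] = "some key";
	const uint8_t pt1[] = "first plaintext!";
	const uint8_t pt2[] = "other plaintext?";
	uint8_t ct1[16], ct2[16];
	struct rc4_state st;
	int i;

	/* two "requests", each restarted from the freshly keyed state */
	rc4_setup(&st, key, sizeof(key) - 1);
	rc4_crypt(&st, pt1, ct1, 16);
	rc4_setup(&st, key, sizeof(key) - 1);
	rc4_crypt(&st, pt2, ct2, 16);

	/* the keystream cancels out: ct1 ^ ct2 == pt1 ^ pt2 */
	for (i = 0; i < 16; i++)
		if ((ct1[i] ^ ct2[i]) != (pt1[i] ^ pt2[i]))
			return 1;
	printf("keystream reuse confirmed: ct1 ^ ct2 == pt1 ^ pt2\n");
	return 0;
}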

I wonder if we should simply try to get rid of arc4 in the crypto API,
as you suggested. There are a couple of WEP implementations that could
be switched over to the library interface, and the KerberosV
implementation of RC4-HMAC(md5) was added for Windows 2000
compatibility, based on RFC 4757 [0]. That RFC has since been deprecated
by RFC 8429 [1], given that Windows Domain Controllers running Windows
Server 2008 R2 or later can use newer algorithms.

[0] https://tools.ietf.org/html/rfc4757
[1] https://tools.ietf.org/html/rfc8429


>
> On Thu, Jul 02, 2020 at 05:56:16PM +1000, Herbert Xu wrote:
> >
> > For XTS I haven't decided whether to go this way or not.  If it
> > does work out though we could even extend it to AEAD.
>
> But there is clearly a need for this functionality, and it's
> not just af_alg.  Have a look at net/sunrpc/auth_gss/gss_krb5_crypto.c,
> it has three versions of the same crypto code (arc4, cts, and
> everything else), in order to deal with continuing requests just
> like algif_skcipher.
>
> Perhaps at the end of this we could distill it down to just one.
>

I agree that there is a gap here.

Perhaps we can decouple ARC4 from the other cases? ARC4 is too quirky
and irrelevant to model this on top of, I think.


Re: [PATCH] crypto: caam - Remove broken arc4 support

2020-07-02 Thread Ard Biesheuvel
On Thu, 2 Jul 2020 at 09:45, Herbert Xu  wrote:
>
> On Thu, Jul 02, 2020 at 09:40:42AM +0200, Ard Biesheuvel wrote:
> >
> > I suppose you are looking into this for chaining algif_skcipher
> > requests, right? So in that case, the ARC4 state should really be
> > treated as an IV, which is owned by the caller, and not stored in
> > either the TFM or the skcipher request object.
>
> Yes I have considered this approach previously but it's just too
> messy.  What I'm trying to do now is to allow the state to be stored
> in the request object.  When combined with the proposed REQ_MORE
> flag, this should be sufficient.  It evens works on XTS.
>

But that requires the caller to reuse the same skcipher request object
when doing chaining, right?

Currently, we have no such requirement, and it would mean that the
request object's context struct would have to be laid out compatibly
across different implementations, e.g., when a driver ends up invoking
a fallback for the last N bytes of a chained request.

I'll wait for the code to be posted (please put me on cc), but my
suspicion is that carrying opaque state like that is going to bite us
down the road.


Re: [PATCH] crypto: caam - Remove broken arc4 support

2020-07-02 Thread Ard Biesheuvel
On Thu, 2 Jul 2020 at 09:32, Ard Biesheuvel  wrote:
>
> On Thu, 2 Jul 2020 at 09:27, Ard Biesheuvel  wrote:
> >
> > On Thu, 2 Jul 2020 at 06:36, Herbert Xu  wrote:
> > >
> > > The arc4 algorithm requires storing state in the request context
> > > in order to allow more than one encrypt/decrypt operation.  As this
> > > driver does not seem to do that, it means that using it for more
> > > than one operation is broken.
> > >
> > > Fixes: eaed71a44ad9 ("crypto: caam - add ecb(*) support")
> > > Signed-off-by: Herbert Xu 
> > >
> >
> > Acked-by: Ard Biesheuvel 
> >
> > All internal users of ecb(arc4) use sync skciphers, so this should
> > only affect user space.
> >
> > I do wonder if the others are doing any better - n2 and bcm iproc also
> > appear to keep the state in the TFM object, while I'd expect the
> > setkey() to be a simple memcpy(), and the initial state derivation to
> > be part of the encrypt flow, right?
> >
> > Maybe we should add a test for this to tcrypt, i.e., do setkey() once
> > and do two encryptions of the same input, and check whether we get
> > back the original data.
> >
>
> Actually, it seems the generic ecb(arc4) is broken as well in this regard.

This may be strictly true, but perhaps reusing the key is not such a
great idea to begin with, given the lack of an IV, so the fact that
skcipher::setkey() operates on the TFM and not the request simply does
not match the ARC4 model.

I suppose you are looking into this for chaining algif_skcipher
requests, right? So in that case, the ARC4 state should really be
treated as an IV, which is owned by the caller, and not stored in
either the TFM or the skcipher request object.


Re: [PATCH] crypto: caam - Remove broken arc4 support

2020-07-02 Thread Ard Biesheuvel
On Thu, 2 Jul 2020 at 09:27, Ard Biesheuvel  wrote:
>
> On Thu, 2 Jul 2020 at 06:36, Herbert Xu  wrote:
> >
> > The arc4 algorithm requires storing state in the request context
> > in order to allow more than one encrypt/decrypt operation.  As this
> > driver does not seem to do that, it means that using it for more
> > than one operation is broken.
> >
> > Fixes: eaed71a44ad9 ("crypto: caam - add ecb(*) support")
> > Signed-off-by: Herbert Xu 
> >
>
> Acked-by: Ard Biesheuvel 
>
> All internal users of ecb(arc4) use sync skciphers, so this should
> only affect user space.
>
> I do wonder if the others are doing any better - n2 and bcm iproc also
> appear to keep the state in the TFM object, while I'd expect the
> setkey() to be a simple memcpy(), and the initial state derivation to
> be part of the encrypt flow, right?
>
> Maybe we should add a test for this to tcrypt, i.e., do setkey() once
> and do two encryptions of the same input, and check whether we get
> back the original data.
>

Actually, it seems the generic ecb(arc4) is broken as well in this regard.


Re: [PATCH] crypto: caam - Remove broken arc4 support

2020-07-02 Thread Ard Biesheuvel
On Thu, 2 Jul 2020 at 06:36, Herbert Xu  wrote:
>
> The arc4 algorithm requires storing state in the request context
> in order to allow more than one encrypt/decrypt operation.  As this
> driver does not seem to do that, it means that using it for more
> than one operation is broken.
>
> Fixes: eaed71a44ad9 ("crypto: caam - add ecb(*) support")
> Signed-off-by: Herbert Xu 
>

Acked-by: Ard Biesheuvel 

All internal users of ecb(arc4) use sync skciphers, so this should
only affect user space.

I do wonder if the others are doing any better - n2 and bcm iproc also
appear to keep the state in the TFM object, while I'd expect the
setkey() to be a simple memcpy(), and the initial state derivation to
be part of the encrypt flow, right?

Maybe we should add a test for this to tcrypt, i.e., do setkey() once
and do two encryptions of the same input, and check whether we get
back the original data.
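
Something along those lines could look roughly like the sketch below,
assuming the "two encryptions of the same input" are done in place on the
same buffer (a rough illustration against the sync skcipher API, not an
actual tcrypt patch; the helper name is made up):

#include <linux/err.h>
#include <linux/random.h>
#include <linux/scatterlist.h>
#include <linux/string.h>
#include <crypto/skcipher.h>

/* returns 0 if the keystream advances across requests, -EINVAL if it restarts */
static int arc4_chaining_check(void)
{
	static const u8 key[] = "an arc4 test key";
	struct crypto_sync_skcipher *tfm;
	u8 buf[64], orig[64];
	struct scatterlist sg;
	int err;

	tfm = crypto_alloc_sync_skcipher("ecb(arc4)", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	get_random_bytes(buf, sizeof(buf));
	memcpy(orig, buf, sizeof(buf));

	err = crypto_sync_skcipher_setkey(tfm, key, sizeof(key) - 1);
	if (!err) {
		SYNC_SKCIPHER_REQUEST_ON_STACK(req, tfm);

		sg_init_one(&sg, buf, sizeof(buf));
		skcipher_request_set_sync_tfm(req, tfm);
		skcipher_request_set_callback(req, 0, NULL, NULL);
		skcipher_request_set_crypt(req, &sg, &sg, sizeof(buf), NULL);

		/* first pass: buf = orig ^ keystream[0..63] */
		err = crypto_skcipher_encrypt(req);
		/* second pass: should continue with keystream[64..127] */
		if (!err)
			err = crypto_skcipher_encrypt(req);
		/* if the keystream restarted, the two XORs cancel out */
		if (!err && !memcmp(buf, orig, sizeof(buf)))
			err = -EINVAL;
	}

	crypto_free_sync_skcipher(tfm);
	return err;
}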


> diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
> index b2f9882bc010f..797bff9b93182 100644
> --- a/drivers/crypto/caam/caamalg.c
> +++ b/drivers/crypto/caam/caamalg.c
> @@ -810,12 +810,6 @@ static int ctr_skcipher_setkey(struct crypto_skcipher *skcipher,
> return skcipher_setkey(skcipher, key, keylen, ctx1_iv_off);
>  }
>
> -static int arc4_skcipher_setkey(struct crypto_skcipher *skcipher,
> -   const u8 *key, unsigned int keylen)
> -{
> -   return skcipher_setkey(skcipher, key, keylen, 0);
> -}
> -
>  static int des_skcipher_setkey(struct crypto_skcipher *skcipher,
>const u8 *key, unsigned int keylen)
>  {
> @@ -1967,21 +1961,6 @@ static struct caam_skcipher_alg driver_algs[] = {
> },
> .caam.class1_alg_type = OP_ALG_ALGSEL_3DES | OP_ALG_AAI_ECB,
> },
> -   {
> -   .skcipher = {
> -   .base = {
> -   .cra_name = "ecb(arc4)",
> -   .cra_driver_name = "ecb-arc4-caam",
> -   .cra_blocksize = ARC4_BLOCK_SIZE,
> -   },
> -   .setkey = arc4_skcipher_setkey,
> -   .encrypt = skcipher_encrypt,
> -   .decrypt = skcipher_decrypt,
> -   .min_keysize = ARC4_MIN_KEY_SIZE,
> -   .max_keysize = ARC4_MAX_KEY_SIZE,
> -   },
> -   .caam.class1_alg_type = OP_ALG_ALGSEL_ARC4 | OP_ALG_AAI_ECB,
> -   },
>  };
>
>  static struct caam_aead_alg driver_aeads[] = {
> @@ -3457,7 +3436,6 @@ int caam_algapi_init(struct device *ctrldev)
> struct caam_drv_private *priv = dev_get_drvdata(ctrldev);
> int i = 0, err = 0;
> u32 aes_vid, aes_inst, des_inst, md_vid, md_inst, ccha_inst, 
> ptha_inst;
> -   u32 arc4_inst;
> unsigned int md_limit = SHA512_DIGEST_SIZE;
> bool registered = false, gcm_support;
>
> @@ -3477,8 +3455,6 @@ int caam_algapi_init(struct device *ctrldev)
>CHA_ID_LS_DES_SHIFT;
> aes_inst = cha_inst & CHA_ID_LS_AES_MASK;
> md_inst = (cha_inst & CHA_ID_LS_MD_MASK) >> 
> CHA_ID_LS_MD_SHIFT;
> -   arc4_inst = (cha_inst & CHA_ID_LS_ARC4_MASK) >>
> -   CHA_ID_LS_ARC4_SHIFT;
> ccha_inst = 0;
> ptha_inst = 0;
>
> @@ -3499,7 +3475,6 @@ int caam_algapi_init(struct device *ctrldev)
> md_inst = mdha & CHA_VER_NUM_MASK;
> ccha_inst = rd_reg32(&priv->ctrl->vreg.ccha) & 
> CHA_VER_NUM_MASK;
> ptha_inst = rd_reg32(&priv->ctrl->vreg.ptha) & 
> CHA_VER_NUM_MASK;
> -   arc4_inst = rd_reg32(&priv->ctrl->vreg.afha) & 
> CHA_VER_NUM_MASK;
>
> gcm_support = aesa & CHA_VER_MISC_AES_GCM;
> }
> @@ -3522,10 +3497,6 @@ int caam_algapi_init(struct device *ctrldev)
> if (!aes_inst && (alg_sel == OP_ALG_ALGSEL_AES))
> continue;
>
> -   /* Skip ARC4 algorithms if not supported by device */
> -   if (!arc4_inst && alg_sel == OP_ALG_ALGSEL_ARC4)
> -   continue;
> -
> /*
>  * Check support for AES modes not available
>  * on LP devices.
> diff --git a/drivers/crypto/caam/compat.h b/drivers/crypto/caam/compat.h
> index 60e2a54c19f11..c3c22a8de4c00 100644
> --- a/drivers/crypto/caam/compat.h
> +++ b/drivers/crypto/caam/compat.h
> @@ -43,7 +43,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> --
> Email: Herbert Xu 
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH v3 10/13] crypto: picoxcell - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the picoxcell driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.
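
All of the per-driver conversions in this series follow the same shape,
roughly as sketched here with made-up foo_* names rather than code from
any particular driver:

#include <linux/err.h>
#include <crypto/internal/skcipher.h>

struct foo_tfm_ctx {
	struct crypto_skcipher *fallback;	/* was a crypto_sync_skcipher */
};

struct foo_req_ctx {
	/* ... driver specific per-request state ... */
	struct skcipher_request fallback_req;	/* keep at the end */
};

static int foo_init_tfm(struct crypto_skcipher *tfm)
{
	struct foo_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);

	ctx->fallback = crypto_alloc_skcipher(crypto_tfm_alg_name(&tfm->base),
					      0, CRYPTO_ALG_NEED_FALLBACK);
	if (IS_ERR(ctx->fallback))
		return PTR_ERR(ctx->fallback);

	/* reserve room for the fallback's request context behind our own */
	crypto_skcipher_set_reqsize(tfm, sizeof(struct foo_req_ctx) +
					 crypto_skcipher_reqsize(ctx->fallback));
	return 0;
}

static int foo_do_fallback(struct skcipher_request *req, bool enc)
{
	struct foo_tfm_ctx *ctx = crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
	struct foo_req_ctx *rctx = skcipher_request_ctx(req);

	skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
	/* hand the caller's completion routine through instead of going sync */
	skcipher_request_set_callback(&rctx->fallback_req, req->base.flags,
				      req->base.complete, req->base.data);
	skcipher_request_set_crypt(&rctx->fallback_req, req->src, req->dst,
				   req->cryptlen, req->iv);

	return enc ? crypto_skcipher_encrypt(&rctx->fallback_req) :
		     crypto_skcipher_decrypt(&rctx->fallback_req);
}

The exit path then frees the fallback with crypto_free_skcipher() instead
of crypto_free_sync_skcipher(), as the hunks below do.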

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/picoxcell_crypto.c | 38 +++-
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/picoxcell_crypto.c b/drivers/crypto/picoxcell_crypto.c
index 7384e91c8b32..13503e16ce1d 100644
--- a/drivers/crypto/picoxcell_crypto.c
+++ b/drivers/crypto/picoxcell_crypto.c
@@ -86,6 +86,7 @@ struct spacc_req {
dma_addr_t  src_addr, dst_addr;
struct spacc_ddt*src_ddt, *dst_ddt;
void(*complete)(struct spacc_req *req);
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 struct spacc_aead {
@@ -158,7 +159,7 @@ struct spacc_ablk_ctx {
 * The fallback cipher. If the operation can't be done in hardware,
 * fallback to a software version.
 */
-   struct crypto_sync_skcipher *sw_cipher;
+   struct crypto_skcipher  *sw_cipher;
 };
 
 /* AEAD cipher context. */
@@ -792,13 +793,13 @@ static int spacc_aes_setkey(struct crypto_skcipher *cipher, const u8 *key,
 * Set the fallback transform to use the same request flags as
 * the hardware transform.
 */
-   crypto_sync_skcipher_clear_flags(ctx->sw_cipher,
+   crypto_skcipher_clear_flags(ctx->sw_cipher,
CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->sw_cipher,
+   crypto_skcipher_set_flags(ctx->sw_cipher,
  cipher->base.crt_flags &
  CRYPTO_TFM_REQ_MASK);
 
-   err = crypto_sync_skcipher_setkey(ctx->sw_cipher, key, len);
+   err = crypto_skcipher_setkey(ctx->sw_cipher, key, len);
if (err)
goto sw_setkey_failed;
}
@@ -900,7 +901,7 @@ static int spacc_ablk_do_fallback(struct skcipher_request *req,
struct crypto_tfm *old_tfm =
crypto_skcipher_tfm(crypto_skcipher_reqtfm(req));
struct spacc_ablk_ctx *ctx = crypto_tfm_ctx(old_tfm);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->sw_cipher);
+   struct spacc_req *dev_req = skcipher_request_ctx(req);
int err;
 
/*
@@ -908,13 +909,13 @@ static int spacc_ablk_do_fallback(struct skcipher_request *req,
 * the ciphering has completed, put the old transform back into the
 * request.
 */
-   skcipher_request_set_sync_tfm(subreq, ctx->sw_cipher);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
+   skcipher_request_set_tfm(&dev_req->fallback_req, ctx->sw_cipher);
+   skcipher_request_set_callback(&dev_req->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&dev_req->fallback_req, req->src, req->dst,
   req->cryptlen, req->iv);
-   err = is_encrypt ? crypto_skcipher_encrypt(subreq) :
-  crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = is_encrypt ? crypto_skcipher_encrypt(&dev_req->fallback_req) :
+  crypto_skcipher_decrypt(&dev_req->fallback_req);
 
return err;
 }
@@ -1007,19 +1008,24 @@ static int spacc_ablk_init_tfm(struct crypto_skcipher *tfm)
ctx->generic.flags = spacc_alg->type;
ctx->generic.engine = engine;
if (alg->base.cra_flags & CRYPTO_ALG_NEED_FALLBACK) {
-   ctx->sw_cipher = crypto_alloc_sync_skcipher(
-   alg->base.cra_name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   ctx->sw_cipher = crypto_alloc_skcipher(alg->base.cra_name, 0,
+  
CRYPTO_ALG_NEED_FALL

[PATCH v3 11/13] crypto: qce - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the qce driver implements asynchronous versions of ecb(aes),
cbc(aes)and xts(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

While at it, remove the pointless memset() from qce_skcipher_init(), and
remove the unnecessary call to it from qce_skcipher_init_fallback().

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/qce/cipher.h   |  3 +-
 drivers/crypto/qce/skcipher.c | 42 ++--
 2 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/crypto/qce/cipher.h b/drivers/crypto/qce/cipher.h
index 7770660bc853..cffa9fc628ff 100644
--- a/drivers/crypto/qce/cipher.h
+++ b/drivers/crypto/qce/cipher.h
@@ -14,7 +14,7 @@
 struct qce_cipher_ctx {
u8 enc_key[QCE_MAX_KEY_SIZE];
unsigned int enc_keylen;
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher *fallback;
 };
 
 /**
@@ -43,6 +43,7 @@ struct qce_cipher_reqctx {
struct sg_table src_tbl;
struct scatterlist *src_sg;
unsigned int cryptlen;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 static inline struct qce_alg_template *to_cipher_tmpl(struct crypto_skcipher *tfm)
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 9412433f3b21..a8147381b774 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -178,7 +178,7 @@ static int qce_skcipher_setkey(struct crypto_skcipher *ablk, const u8 *key,
break;
}
 
-   ret = crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   ret = crypto_skcipher_setkey(ctx->fallback, key, keylen);
if (!ret)
ctx->enc_keylen = keylen;
return ret;
@@ -235,16 +235,15 @@ static int qce_skcipher_crypt(struct skcipher_request *req, int encrypt)
  req->cryptlen <= aes_sw_max_len) ||
 (IS_XTS(rctx->flags) && req->cryptlen > QCE_SECTOR_SIZE &&
  req->cryptlen % QCE_SECTOR_SIZE))) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   ret = encrypt ? crypto_skcipher_encrypt(subreq) :
-   crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   ret = encrypt ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+   crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
 
@@ -263,10 +262,9 @@ static int qce_skcipher_decrypt(struct skcipher_request *req)
 
 static int qce_skcipher_init(struct crypto_skcipher *tfm)
 {
-   struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
-
-   memset(ctx, 0, sizeof(*ctx));
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct qce_cipher_reqctx));
+   /* take the size without the fallback skcipher_request at the end */
+   crypto_skcipher_set_reqsize(tfm, offsetof(struct qce_cipher_reqctx,
+ fallback_req));
return 0;
 }
 
@@ -274,17 +272,21 @@ static int qce_skcipher_init_fallback(struct crypto_skcipher *tfm)
 {
struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
 
-   qce_skcipher_init(tfm);
-   ctx->fallback = 
crypto_alloc_sync_skcipher(crypto_tfm_alg_name(&tf

[PATCH v3 13/13] crypto: mediatek - use AES library for GCM key derivation

2020-06-30 Thread Ard Biesheuvel
The Mediatek accelerator driver calls into a dynamically allocated
skcipher of the ctr(aes) variety to perform GCM key derivation, which
involves AES encryption of a single block consisting of NUL bytes.

There is no point in using the skcipher API for this, so use the AES
library interface instead.
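
With the library interface, the derivation of the hash key H = E_K(0^128)
collapses to a couple of direct calls, roughly as sketched below (the
helper name is made up; the calls mirror what the patch below switches to):

#include <linux/string.h>
#include <crypto/aes.h>		/* aes_expandkey(), aes_encrypt() */

/* sketch: derive the GCM hash key H = AES_K(0^128) via the AES library */
static int example_derive_hash_key(const u8 *key, unsigned int keylen,
				   u8 hash[AES_BLOCK_SIZE])
{
	struct crypto_aes_ctx aes_ctx;
	int err;

	err = aes_expandkey(&aes_ctx, key, keylen);
	if (err)
		return err;

	memset(hash, 0, AES_BLOCK_SIZE);
	aes_encrypt(&aes_ctx, hash, hash);		/* hash = E_K(0) */
	memzero_explicit(&aes_ctx, sizeof(aes_ctx));	/* drop the round keys */

	return 0;
}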

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/Kconfig|  3 +-
 drivers/crypto/mediatek/mtk-aes.c | 63 +++-
 2 files changed, 9 insertions(+), 57 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 802b9ada4e9e..c8c3ebb248f8 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -756,10 +756,9 @@ config CRYPTO_DEV_ZYNQMP_AES
 config CRYPTO_DEV_MEDIATEK
tristate "MediaTek's EIP97 Cryptographic Engine driver"
depends on (ARM && ARCH_MEDIATEK) || COMPILE_TEST
-   select CRYPTO_AES
+   select CRYPTO_LIB_AES
select CRYPTO_AEAD
select CRYPTO_SKCIPHER
-   select CRYPTO_CTR
select CRYPTO_SHA1
select CRYPTO_SHA256
select CRYPTO_SHA512
diff --git a/drivers/crypto/mediatek/mtk-aes.c b/drivers/crypto/mediatek/mtk-aes.c
index 78d660d963e2..4ad3571ab6af 100644
--- a/drivers/crypto/mediatek/mtk-aes.c
+++ b/drivers/crypto/mediatek/mtk-aes.c
@@ -137,8 +137,6 @@ struct mtk_aes_gcm_ctx {
 
u32 authsize;
size_t textlen;
-
-   struct crypto_skcipher *ctr;
 };
 
 struct mtk_aes_drv {
@@ -996,17 +994,8 @@ static int mtk_aes_gcm_setkey(struct crypto_aead *aead, const u8 *key,
  u32 keylen)
 {
struct mtk_aes_base_ctx *ctx = crypto_aead_ctx(aead);
-   struct mtk_aes_gcm_ctx *gctx = mtk_aes_gcm_ctx_cast(ctx);
-   struct crypto_skcipher *ctr = gctx->ctr;
-   struct {
-   u32 hash[4];
-   u8 iv[8];
-
-   struct crypto_wait wait;
-
-   struct scatterlist sg[1];
-   struct skcipher_request req;
-   } *data;
+   u8 hash[AES_BLOCK_SIZE] __aligned(4) = {};
+   struct crypto_aes_ctx aes_ctx;
int err;
 
switch (keylen) {
@@ -1026,39 +1015,18 @@ static int mtk_aes_gcm_setkey(struct crypto_aead *aead, const u8 *key,
 
ctx->keylen = SIZE_IN_WORDS(keylen);
 
-   /* Same as crypto_gcm_setkey() from crypto/gcm.c */
-   crypto_skcipher_clear_flags(ctr, CRYPTO_TFM_REQ_MASK);
-   crypto_skcipher_set_flags(ctr, crypto_aead_get_flags(aead) &
- CRYPTO_TFM_REQ_MASK);
-   err = crypto_skcipher_setkey(ctr, key, keylen);
+   err = aes_expandkey(&aes_ctx, key, keylen);
if (err)
return err;
 
-   data = kzalloc(sizeof(*data) + crypto_skcipher_reqsize(ctr),
-  GFP_KERNEL);
-   if (!data)
-   return -ENOMEM;
-
-   crypto_init_wait(&data->wait);
-   sg_init_one(data->sg, &data->hash, AES_BLOCK_SIZE);
-   skcipher_request_set_tfm(&data->req, ctr);
-   skcipher_request_set_callback(&data->req, CRYPTO_TFM_REQ_MAY_SLEEP |
- CRYPTO_TFM_REQ_MAY_BACKLOG,
- crypto_req_done, &data->wait);
-   skcipher_request_set_crypt(&data->req, data->sg, data->sg,
-  AES_BLOCK_SIZE, data->iv);
-
-   err = crypto_wait_req(crypto_skcipher_encrypt(&data->req),
- &data->wait);
-   if (err)
-   goto out;
+   aes_encrypt(&aes_ctx, hash, hash);
+   memzero_explicit(&aes_ctx, sizeof(aes_ctx));
 
mtk_aes_write_state_le(ctx->key, (const u32 *)key, keylen);
-   mtk_aes_write_state_be(ctx->key + ctx->keylen, data->hash,
+   mtk_aes_write_state_be(ctx->key + ctx->keylen, (const u32 *)hash,
   AES_BLOCK_SIZE);
-out:
-   kzfree(data);
-   return err;
+
+   return 0;
 }
 
 static int mtk_aes_gcm_setauthsize(struct crypto_aead *aead,
@@ -1095,32 +1063,17 @@ static int mtk_aes_gcm_init(struct crypto_aead *aead)
 {
struct mtk_aes_gcm_ctx *ctx = crypto_aead_ctx(aead);
 
-   ctx->ctr = crypto_alloc_skcipher("ctr(aes)", 0,
-CRYPTO_ALG_ASYNC);
-   if (IS_ERR(ctx->ctr)) {
-   pr_err("Error allocating ctr(aes)\n");
-   return PTR_ERR(ctx->ctr);
-   }
-
crypto_aead_set_reqsize(aead, sizeof(struct mtk_aes_reqctx));
ctx->base.start = mtk_aes_gcm_start;
return 0;
 }
 
-static void mtk_aes_gcm_exit(struct crypto_aead *aead)
-{
-   struct mtk_aes_gcm_ctx *ctx = crypto_aead_ctx(aead);
-
-   crypto_free_skcipher(ctx->ctr);
-}
-
 static struct aead_alg aes_gcm_alg = {
.setkey = mtk_aes_gcm_setkey,
.setauthsize= mtk_aes_gcm_setauthsize,
.encrypt= mtk_a

[PATCH v3 09/13] crypto: mxs-dcp - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the mxs-dcp driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/mxs-dcp.c | 33 ++--
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
index d84530293036..909a7eb748e3 100644
--- a/drivers/crypto/mxs-dcp.c
+++ b/drivers/crypto/mxs-dcp.c
@@ -97,7 +97,7 @@ struct dcp_async_ctx {
unsigned inthot:1;
 
/* Crypto-specific context */
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher  *fallback;
unsigned intkey_len;
uint8_t key[AES_KEYSIZE_128];
 };
@@ -105,6 +105,7 @@ struct dcp_async_ctx {
 struct dcp_aes_req_ctx {
unsigned intenc:1;
unsigned intecb:1;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 struct dcp_sha_req_ctx {
@@ -426,21 +427,20 @@ static int dcp_chan_thread_aes(void *data)
 static int mxs_dcp_block_fallback(struct skcipher_request *req, int enc)
 {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+   struct dcp_aes_req_ctx *rctx = skcipher_request_ctx(req);
struct dcp_async_ctx *ctx = crypto_skcipher_ctx(tfm);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
int ret;
 
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src, req->dst,
   req->cryptlen, req->iv);
 
if (enc)
-   ret = crypto_skcipher_encrypt(subreq);
+   ret = crypto_skcipher_encrypt(&rctx->fallback_req);
else
-   ret = crypto_skcipher_decrypt(subreq);
-
-   skcipher_request_zero(subreq);
+   ret = crypto_skcipher_decrypt(&rctx->fallback_req);
 
return ret;
 }
@@ -510,24 +510,25 @@ static int mxs_dcp_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
 * but is supported by in-kernel software implementation, we use
 * software fallback.
 */
-   crypto_sync_skcipher_clear_flags(actx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(actx->fallback,
+   crypto_skcipher_clear_flags(actx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(actx->fallback,
  tfm->base.crt_flags & CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(actx->fallback, key, len);
+   return crypto_skcipher_setkey(actx->fallback, key, len);
 }
 
 static int mxs_dcp_aes_fallback_init_tfm(struct crypto_skcipher *tfm)
 {
const char *name = crypto_tfm_alg_name(crypto_skcipher_tfm(tfm));
struct dcp_async_ctx *actx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *blk;
+   struct crypto_skcipher *blk;
 
-   blk = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   blk = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(blk))
return PTR_ERR(blk);
 
actx->fallback = blk;
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct dcp_aes_req_ctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct dcp_aes_req_ctx) +
+crypto_skcipher_reqsize(blk));
return 0;
 }
 
@@ -535,7 +536,7 @@ static void mxs_dcp_aes_fallback_exit_tfm(struct crypto_skcipher *tfm)
 {
struct dcp_async_ctx *actx = crypto_skcipher_ctx(tfm);
 
-   crypto_free_sync_skcipher(actx->fallback);
+   crypto_free_skcipher(actx->fallback);
 }
 
 /*
-- 
2.17.1



[PATCH v3 06/13] crypto: sun8i-ss - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the sun8i-ss driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c | 39 ++--
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h|  3 +-
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
index c89cb2ee2496..7a131675a41c 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
@@ -73,7 +73,6 @@ static int sun8i_ss_cipher_fallback(struct skcipher_request *areq)
struct sun8i_cipher_req_ctx *rctx = skcipher_request_ctx(areq);
int err;
 
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
 #ifdef CONFIG_CRYPTO_DEV_SUN8I_SS_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct sun8i_ss_alg_template *algt;
@@ -81,15 +80,15 @@ static int sun8i_ss_cipher_fallback(struct skcipher_request *areq)
algt = container_of(alg, struct sun8i_ss_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (rctx->op_dir & SS_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -334,18 +333,20 @@ int sun8i_ss_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct sun8i_ss_alg_template, alg.skcipher);
op->ss = algt->ss;
 
-   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ss->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
+
dev_info(op->ss->dev, "Fallback for %s is %s\n",
 crypto_tfm_alg_driver_name(&sktfm->base),
-   crypto_tfm_alg_driver_name(crypto_skcipher_tfm(&op->fallback_tfm->base)));
+   crypto_tfm_alg_driver_name(crypto_skcipher_tfm(op->fallback_tfm)));
 
op->enginectx.op.do_one_request = sun8i_ss_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
@@ -359,7 +360,7 @@ int sun8i_ss_cipher_init(struct crypto_tfm *tfm)
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -371,7 +372,7 @@ void sun8i_ss_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put_sync(op->ss->dev);
 }
 
@@ -401,10 +402,10 @@ int sun8i_ss_aes_setkey(struct crypto_skcipher *tfm, 
const u8

[PATCH v3 07/13] crypto: ccp - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the ccp driver implements an asynchronous version of xts(aes),
the fallback it allocates is required to be synchronous. Given that SIMD
based software implementations are usually asynchronous as well, even
though they rarely complete asynchronously (this typically only happens
in cases where the request was made from softirq context, while SIMD was
already in use in the task context that it interrupted), these
implementations are disregarded, and either the generic C version or
another table based version implemented in assembler is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c | 33 ++--
 drivers/crypto/ccp/ccp-crypto.h |  4 ++-
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
index 04b2517df955..959168a7ac59 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
@@ -98,7 +98,7 @@ static int ccp_aes_xts_setkey(struct crypto_skcipher *tfm, const u8 *key,
ctx->u.aes.key_len = key_len / 2;
sg_init_one(&ctx->u.aes.key_sg, ctx->u.aes.key, key_len);
 
-   return crypto_sync_skcipher_setkey(ctx->u.aes.tfm_skcipher, key, 
key_len);
+   return crypto_skcipher_setkey(ctx->u.aes.tfm_skcipher, key, key_len);
 }
 
 static int ccp_aes_xts_crypt(struct skcipher_request *req,
@@ -145,20 +145,19 @@ static int ccp_aes_xts_crypt(struct skcipher_request *req,
(ctx->u.aes.key_len != AES_KEYSIZE_256))
fallback = 1;
if (fallback) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq,
-  ctx->u.aes.tfm_skcipher);
-
/* Use the fallback to process the request for any
 * unsupported unit sizes or key sizes
 */
-   skcipher_request_set_sync_tfm(subreq, ctx->u.aes.tfm_skcipher);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   ret = encrypt ? crypto_skcipher_encrypt(subreq) :
-   crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   skcipher_request_set_tfm(&rctx->fallback_req,
+ctx->u.aes.tfm_skcipher);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   ret = encrypt ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+   crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
 
@@ -198,13 +197,12 @@ static int ccp_aes_xts_decrypt(struct skcipher_request *req)
 static int ccp_aes_xts_init_tfm(struct crypto_skcipher *tfm)
 {
struct ccp_ctx *ctx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *fallback_tfm;
+   struct crypto_skcipher *fallback_tfm;
 
ctx->complete = ccp_aes_xts_complete;
ctx->u.aes.key_len = 0;
 
-   fallback_tfm = crypto_alloc_sync_skcipher("xts(aes)", 0,
-CRYPTO_ALG_ASYNC |
+   fallback_tfm = crypto_alloc_skcipher("xts(aes)", 0,
 CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(fallback_tfm)) {
pr_warn("could not load fallback driver xts(aes)\n");
@@ -212,7 +210,8 @@ static int ccp_aes_xts_init_tfm(struct crypto_skcipher *tfm)
}
ctx->u.aes.tfm_skcipher = fallback_tfm;
 
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct ccp_aes_req_ctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct ccp_aes_req_ctx) +
+crypto_skcipher_reqsize(fallback_tfm));
 
return 0;
 }
@@ -221,7 +220,7 @@ static void ccp_aes_xts_exit_tfm(struct crypto_skcipher *tfm)
 {
struct ccp_ctx *ctx = crypto_skcipher_ctx(tfm);
 
-   crypto_free_sync_skcipher(ctx->u.aes.tfm_skcipher);
+   crypto_fre

[PATCH v3 08/13] crypto: chelsio - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the chelsio driver implements asynchronous versions of
cbc(aes) and xts(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/chelsio/chcr_algo.c   | 57 
 drivers/crypto/chelsio/chcr_crypto.h |  3 +-
 2 files changed, 25 insertions(+), 35 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index 4c2553672b6f..a6625b90fb1a 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -690,26 +690,22 @@ static int chcr_sg_ent_in_wr(struct scatterlist *src,
return min(srclen, dstlen);
 }
 
-static int chcr_cipher_fallback(struct crypto_sync_skcipher *cipher,
-   u32 flags,
-   struct scatterlist *src,
-   struct scatterlist *dst,
-   unsigned int nbytes,
+static int chcr_cipher_fallback(struct crypto_skcipher *cipher,
+   struct skcipher_request *req,
u8 *iv,
unsigned short op_type)
 {
+   struct chcr_skcipher_req_ctx *reqctx = skcipher_request_ctx(req);
int err;
 
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, cipher);
-
-   skcipher_request_set_sync_tfm(subreq, cipher);
-   skcipher_request_set_callback(subreq, flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, src, dst,
-  nbytes, iv);
+   skcipher_request_set_tfm(&reqctx->fallback_req, cipher);
+   skcipher_request_set_callback(&reqctx->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&reqctx->fallback_req, req->src, req->dst,
+  req->cryptlen, iv);
 
-   err = op_type ? crypto_skcipher_decrypt(subreq) :
-   crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = op_type ? crypto_skcipher_decrypt(&reqctx->fallback_req) :
+   crypto_skcipher_encrypt(&reqctx->fallback_req);
 
return err;
 
@@ -924,11 +920,11 @@ static int chcr_cipher_fallback_setkey(struct crypto_skcipher *cipher,
 {
struct ablk_ctx *ablkctx = ABLK_CTX(c_ctx(cipher));
 
-   crypto_sync_skcipher_clear_flags(ablkctx->sw_cipher,
+   crypto_skcipher_clear_flags(ablkctx->sw_cipher,
CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ablkctx->sw_cipher,
+   crypto_skcipher_set_flags(ablkctx->sw_cipher,
cipher->base.crt_flags & CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(ablkctx->sw_cipher, key, keylen);
+   return crypto_skcipher_setkey(ablkctx->sw_cipher, key, keylen);
 }
 
 static int chcr_aes_cbc_setkey(struct crypto_skcipher *cipher,
@@ -1206,13 +1202,8 @@ static int chcr_handle_cipher_resp(struct skcipher_request *req,
  req);
memcpy(req->iv, reqctx->init_iv, IV);
atomic_inc(&adap->chcr_stats.fallback);
-   err = chcr_cipher_fallback(ablkctx->sw_cipher,
-req->base.flags,
-req->src,
-req->dst,
-req->cryptlen,
-req->iv,
-reqctx->op);
+   err = chcr_cipher_fallback(ablkctx->sw_cipher, req, req->iv,
+  reqctx->op);
goto complete;
}
 
@@ -1341,11 +1332,7 @@ static int process_cipher(struct skcipher_request *req,
chcr_cipher_dma_unmap(&ULD_CTX(c_ctx(tfm))->lldi.pdev->dev,
  req);
 fallback:   atomic_inc(&adap->chcr_stats.fallback);
-   err = chcr_cipher_fallback(ablkctx->sw_cipher,
- 

[PATCH v3 05/13] crypto: sun8i-ce - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the sun8i-ce driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 41 ++--
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h|  3 +-
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index a6abb701bfc6..82c99da24dfd 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -58,23 +58,20 @@ static int sun8i_ce_cipher_fallback(struct skcipher_request *areq)
 #ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct sun8i_ce_alg_template *algt;
-#endif
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
 
-#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
 
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (rctx->op_dir & CE_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -335,18 +332,20 @@ int sun8i_ce_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);
op->ce = algt->ce;
 
-   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ce->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
+
dev_info(op->ce->dev, "Fallback for %s is %s\n",
 crypto_tfm_alg_driver_name(&sktfm->base),
-   crypto_tfm_alg_driver_name(crypto_skcipher_tfm(&op->fallback_tfm->base)));
+   crypto_tfm_alg_driver_name(crypto_skcipher_tfm(op->fallback_tfm)));
 
op->enginectx.op.do_one_request = sun8i_ce_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
@@ -358,7 +357,7 @@ int sun8i_ce_cipher_init(struct crypto_tfm *tfm)
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -370,7 +369,7 @@ void sun8i_ce_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put_sync_suspend(op->ce->dev);
 }
 
@@ -400,10 +399,10 @@ int sun8i_ce_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
if (!op->key)
return -ENOMEM;
 
-   crypto_sync_skcipher_clear_flags(op->f

[PATCH v3 12/13] crypto: sahara - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the sahara driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/sahara.c | 96 +---
 1 file changed, 45 insertions(+), 51 deletions(-)

diff --git a/drivers/crypto/sahara.c b/drivers/crypto/sahara.c
index 466e30bd529c..0c8cb23ae708 100644
--- a/drivers/crypto/sahara.c
+++ b/drivers/crypto/sahara.c
@@ -146,11 +146,12 @@ struct sahara_ctx {
/* AES-specific context */
int keylen;
u8 key[AES_KEYSIZE_128];
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher *fallback;
 };
 
 struct sahara_aes_reqctx {
unsigned long mode;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 /*
@@ -617,10 +618,10 @@ static int sahara_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
/*
 * The requested key size is not supported by HW, do a fallback.
 */
-   crypto_sync_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
+   crypto_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
 CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   return crypto_skcipher_setkey(ctx->fallback, key, keylen);
 }
 
 static int sahara_aes_crypt(struct skcipher_request *req, unsigned long mode)
@@ -651,21 +652,19 @@ static int sahara_aes_crypt(struct skcipher_request *req, unsigned long mode)
 
 static int sahara_aes_ecb_encrypt(struct skcipher_request *req)
 {
+   struct sahara_aes_reqctx *rctx = skcipher_request_ctx(req);
struct sahara_ctx *ctx = crypto_skcipher_ctx(
crypto_skcipher_reqtfm(req));
-   int err;
 
if (unlikely(ctx->keylen != AES_KEYSIZE_128)) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
-   return err;
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   return crypto_skcipher_encrypt(&rctx->fallback_req);
}
 
return sahara_aes_crypt(req, FLAGS_ENCRYPT);
@@ -673,21 +672,19 @@ static int sahara_aes_ecb_encrypt(struct skcipher_request *req)
 
 static int sahara_aes_ecb_decrypt(struct skcipher_request *req)
 {
+   struct sahara_aes_reqctx *rctx = skcipher_request_ctx(req);
struct sahara_ctx *ctx = crypto_skcipher_ctx(
crypto_skcipher_reqtfm(req));
-   int err;
 
if (unlikely(ctx->keylen != AES_KEYSIZE_128)) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   err = crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
- 

[PATCH v3 04/13] crypto: sun4i - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the sun4i driver implements asynchronous versions of ecb(aes)
and cbc(aes), the fallbacks it allocates are required to be synchronous.
Given that SIMD based software implementations are usually asynchronous
as well, even though they rarely complete asynchronously (this typically
only happens in cases where the request was made from softirq context,
while SIMD was already in use in the task context that it interrupted),
these implementations are disregarded, and either the generic C version
or another table based version implemented in assembler is selected
instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c | 46 ++--
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss.h|  3 +-
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
index 7f22d305178e..b72de8939497 100644
--- a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
@@ -122,19 +122,17 @@ static int noinline_for_stack sun4i_ss_cipher_poll_fallback(struct skcipher_requ
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
struct sun4i_tfm_ctx *op = crypto_skcipher_ctx(tfm);
struct sun4i_cipher_req_ctx *ctx = skcipher_request_ctx(areq);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
int err;
 
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL,
- NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&ctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&ctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&ctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (ctx->mode & SS_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&ctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&ctx->fallback_req);
 
return err;
 }
@@ -494,23 +492,25 @@ int sun4i_ss_cipher_init(struct crypto_tfm *tfm)
alg.crypto.base);
op->ss = algt->ss;
 
-   crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
-   sizeof(struct sun4i_cipher_req_ctx));
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ss->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+   sizeof(struct sun4i_cipher_req_ctx) +
+   crypto_skcipher_reqsize(op->fallback_tfm));
+
+
err = pm_runtime_get_sync(op->ss->dev);
if (err < 0)
goto error_pm;
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -518,7 +518,7 @@ void sun4i_ss_cipher_exit(struct crypto_tfm *tfm)
 {
struct sun4i_tfm_ctx *op = crypto_tfm_ctx(tfm);
 
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put(op->ss->dev);
 }
 
@@ -546,10 +546,10 @@ int sun4i_ss_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
op->keylen = keylen;
memcpy(op->key, key, keylen);
 
-   crypto_sync_skcipher_clear_flags(op->fallback_tfm, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(op->fallback_tfm, tfm->base.crt_flags & 
CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_clear_flags(op->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(op->fallback_tfm, tfm->base.crt_flags & 
CRYPTO_TFM_REQ_MASK);
 
-   return crypto_sync_skcipher_setkey(op->fa

[PATCH v3 03/13] crypto: omap-aes - permit asynchronous skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the omap-aes driver implements asynchronous versions of
ecb(aes), cbc(aes) and ctr(aes), the fallbacks it allocates are required
to be synchronous. Given that SIMD based software implementations are
usually asynchronous as well, even though they rarely complete
asynchronously (this typically only happens in cases where the request was
made from softirq context, while SIMD was already in use in the task
context that it interrupted), these implementations are disregarded, and
either the generic C version or another table based version implemented in
assembler is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback, and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/omap-aes.c | 35 ++--
 drivers/crypto/omap-aes.h |  3 +-
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index b5aff20c5900..25154b74dcc6 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -548,20 +548,18 @@ static int omap_aes_crypt(struct skcipher_request *req, 
unsigned long mode)
  !!(mode & FLAGS_CBC));
 
if (req->cryptlen < aes_fallback_sz) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL,
- NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
 
if (mode & FLAGS_ENCRYPT)
-   ret = crypto_skcipher_encrypt(subreq);
+   ret = crypto_skcipher_encrypt(&rctx->fallback_req);
else
-   ret = crypto_skcipher_decrypt(subreq);
-
-   skcipher_request_zero(subreq);
+   ret = crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
dd = omap_aes_find_dev(rctx);
@@ -590,11 +588,11 @@ static int omap_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
memcpy(ctx->key, key, keylen);
ctx->keylen = keylen;
 
-   crypto_sync_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
+   crypto_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
 CRYPTO_TFM_REQ_MASK);
 
-   ret = crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   ret = crypto_skcipher_setkey(ctx->fallback, key, keylen);
if (!ret)
return 0;
 
@@ -640,15 +638,16 @@ static int omap_aes_init_tfm(struct crypto_skcipher *tfm)
 {
const char *name = crypto_tfm_alg_name(&tfm->base);
struct omap_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *blk;
+   struct crypto_skcipher *blk;
 
-   blk = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   blk = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(blk))
return PTR_ERR(blk);
 
ctx->fallback = blk;
 
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct omap_aes_reqctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct omap_aes_reqctx) +
+crypto_skcipher_reqsize(blk));
 
ctx->enginectx.op.prepare_request = omap_aes_prepare_req;
ctx->enginectx.op.unprepare_request = NULL;
@@ -662,7 +661,7 @@ static void omap_aes_exit_tfm(struct crypto_skcipher *tfm)
struct omap_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 
if (ctx->fallback)
-   crypto_free_sync_skcipher(ctx->fallback);
+   crypto_free_skcipher(ctx->fallback);
 
ctx->fallback = NULL;
 }
diff --git a/drivers/crypto/omap-aes.h b/drivers/crypto/omap-aes.h
index 2d111bf906e1..23d073e87bb8 10064

[PATCH v3 02/13] crypto: amlogic-gxl - permit async skcipher as fallback

2020-06-30 Thread Ard Biesheuvel
Even though the amlogic-gxl driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue,
but potentially a security issue as well (due to the fact that table
based AES is not time invariant), let's fix this by allocating an
ordinary skcipher as the fallback, and invoking it with the completion
routine that was given to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/amlogic/amlogic-gxl-cipher.c | 27 ++--
 drivers/crypto/amlogic/amlogic-gxl.h|  3 ++-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/amlogic/amlogic-gxl-cipher.c 
b/drivers/crypto/amlogic/amlogic-gxl-cipher.c
index 9819dd50fbad..5880b94dcb32 100644
--- a/drivers/crypto/amlogic/amlogic-gxl-cipher.c
+++ b/drivers/crypto/amlogic/amlogic-gxl-cipher.c
@@ -64,22 +64,20 @@ static int meson_cipher_do_fallback(struct skcipher_request 
*areq)
 #ifdef CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct meson_alg_template *algt;
-#endif
-   SYNC_SKCIPHER_REQUEST_ON_STACK(req, op->fallback_tfm);
 
-#ifdef CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG
algt = container_of(alg, struct meson_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
-   skcipher_request_set_sync_tfm(req, op->fallback_tfm);
-   skcipher_request_set_callback(req, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(req, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
+
if (rctx->op_dir == MESON_DECRYPT)
-   err = crypto_skcipher_decrypt(req);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(req);
-   skcipher_request_zero(req);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -321,15 +319,16 @@ int meson_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct meson_alg_template, alg.skcipher);
op->mc = algt->mc;
 
-   sktfm->reqsize = sizeof(struct meson_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->mc->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct meson_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
op->enginectx.op.do_one_request = meson_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
op->enginectx.op.unprepare_request = NULL;
@@ -345,7 +344,7 @@ void meson_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
 }
 
 int meson_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
@@ -377,5 +376,5 @@ int meson_aes_setkey(struct crypto_skcipher *tfm, const u8 
*key,
if (!op->key)
return -ENOMEM;
 
-   return crypto_sync_skcipher_setkey(op->fallback_tfm, key, keylen);
+   return crypto_skcipher_setkey(op->fallback_tfm, key, keylen);
 }
diff --git a/drivers/crypto/amlogic/amlogic-gxl.h 
b/drivers/crypto/amlogic/amlogic-gxl.h
index b7f2de91ab76..dc0f142324a3 100644
--- a/drivers/crypto/amlogic/amlogic-gxl.h
+++ b/drivers/crypto/amlogic/amlogic-gxl.h
@@ -109,6 +109,7 @@ struct meson_dev {
 struct meson_cipher_req_ctx {
u32 op_dir;
int flow;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 /*
@@ -126,7 +127,7 @@ struct meson_cipher_tfm_ctx {
u32 keylen;
u32 keymode;
struct meson_dev *mc;
-   struct crypto_sync_skcipher *fallback_tfm;
+   struct crypto_skcipher *fallback_tfm;
 };
 
 /*
-- 
2.17.1



[PATCH v3 00/13] crypto: permit asynchronous skciphers as driver fallbacks

2020-06-30 Thread Ard Biesheuvel
The drivers for crypto accelerators in drivers/crypto all implement skciphers
of an asynchronous nature, given that they are backed by hardware DMA that
completes asynchronously wrt the execution flow.

However, in many cases, any fallbacks they allocate are limited to the
synchronous variety, which rules out the use of SIMD implementations of
AES in ECB, CBC and XTS modes, given that they are usually built on top
of the asynchronous SIMD helper, which queues requests for asynchronous
completion if they are issued from a context that does not permit the use
of the SIMD register file.

This may result in sub-optimal AES implementations being selected as
fallbacks, or even less secure ones if the only synchronous alternative
is table based, and therefore not time invariant.

So switch all these cases over to the asynchronous API, by moving the
subrequest into the skcipher request context, and permitting it to
complete asynchronously via the caller provided completion function.
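
In rough terms, each driver patch follows the shape sketched below; "foo"
is an illustrative stand-in rather than code from any particular driver,
and field and function names differ per driver:

struct foo_tfm_ctx {
        struct crypto_skcipher *fallback;
};

struct foo_cipher_req_ctx {
        u32 mode;                               /* driver specific state */
        struct skcipher_request fallback_req;   /* must stay at the end */
};

static int foo_init_tfm(struct crypto_skcipher *tfm)
{
        struct foo_tfm_ctx *ctx = crypto_skcipher_ctx(tfm);

        ctx->fallback = crypto_alloc_skcipher(crypto_tfm_alg_name(&tfm->base),
                                              0, CRYPTO_ALG_NEED_FALLBACK);
        if (IS_ERR(ctx->fallback))
                return PTR_ERR(ctx->fallback);

        /* leave room for the fallback's own request context behind ours */
        crypto_skcipher_set_reqsize(tfm, sizeof(struct foo_cipher_req_ctx) +
                                    crypto_skcipher_reqsize(ctx->fallback));
        return 0;
}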

Patch #1 is not related, but touches the same driver as #2 so it is
included anyway. Patch #13 removes another sync skcipher allocation by
switching to the AES library interface.

Only OMAP and CCP were tested on actual hardware - the others are build
tested only.

v3:
- disregard the fallback skcipher_request when taking the request context size
  for TFMs that don't need the fallback at all (picoxcell, qce)
- fix error handling in fallback skcipher allocation and remove pointless
  memset()s (qce)

v2:
- address issue found by build robot in patch #7
- add patch #13
- rebase onto cryptodev/master

Cc: Corentin Labbe 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: Maxime Ripard 
Cc: Chen-Yu Tsai 
Cc: Tom Lendacky 
Cc: Ayush Sawal 
Cc: Vinay Kumar Yadav 
Cc: Rohit Maheshwari 
Cc: Shawn Guo 
Cc: Sascha Hauer 
Cc: Pengutronix Kernel Team 
Cc: Fabio Estevam 
Cc: NXP Linux Team 
Cc: Jamie Iles 
Cc: Eric Biggers 
Cc: Tero Kristo 
Cc: Matthias Brugger 

Ard Biesheuvel (13):
  crypto: amlogic-gxl - default to build as module
  crypto: amlogic-gxl - permit async skcipher as fallback
  crypto: omap-aes - permit asynchronous skcipher as fallback
  crypto: sun4i - permit asynchronous skcipher as fallback
  crypto: sun8i-ce - permit asynchronous skcipher as fallback
  crypto: sun8i-ss - permit asynchronous skcipher as fallback
  crypto: ccp - permit asynchronous skcipher as fallback
  crypto: chelsio - permit asynchronous skcipher as fallback
  crypto: mxs-dcp - permit asynchronous skcipher as fallback
  crypto: picoxcell - permit asynchronous skcipher as fallback
  crypto: qce - permit asynchronous skcipher as fallback
  crypto: sahara - permit asynchronous skcipher as fallback
  crypto: mediatek - use AES library for GCM key derivation

 drivers/crypto/Kconfig|  3 +-
 .../allwinner/sun4i-ss/sun4i-ss-cipher.c  | 46 -
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss.h  |  3 +-
 .../allwinner/sun8i-ce/sun8i-ce-cipher.c  | 41 
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h  |  3 +-
 .../allwinner/sun8i-ss/sun8i-ss-cipher.c  | 39 
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h  |  3 +-
 drivers/crypto/amlogic/Kconfig|  2 +-
 drivers/crypto/amlogic/amlogic-gxl-cipher.c   | 27 +++---
 drivers/crypto/amlogic/amlogic-gxl.h  |  3 +-
 drivers/crypto/ccp/ccp-crypto-aes-xts.c   | 33 ---
 drivers/crypto/ccp/ccp-crypto.h   |  4 +-
 drivers/crypto/chelsio/chcr_algo.c| 57 +--
 drivers/crypto/chelsio/chcr_crypto.h  |  3 +-
 drivers/crypto/mediatek/mtk-aes.c | 63 ++--
 drivers/crypto/mxs-dcp.c  | 33 +++
 drivers/crypto/omap-aes.c | 35 ---
 drivers/crypto/omap-aes.h |  3 +-
 drivers/crypto/picoxcell_crypto.c | 38 
 drivers/crypto/qce/cipher.h   |  3 +-
 drivers/crypto/qce/skcipher.c | 42 
 drivers/crypto/sahara.c   | 96 +--
 22 files changed, 265 insertions(+), 315 deletions(-)

-- 
2.17.1



[PATCH v3 01/13] crypto: amlogic-gxl - default to build as module

2020-06-30 Thread Ard Biesheuvel
The AmLogic GXL crypto accelerator driver is built into the kernel if
ARCH_MESON is set. However, given the single image policy of arm64, its
defconfig enables all platforms by default, and so ARCH_MESON is usually
enabled.

This means that the AmLogic driver causes the arm64 defconfig build to
pull in a huge chunk of the crypto stack as a builtin as well, which is
undesirable, so let's make the AmLogic GXL driver default to 'm' instead.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/amlogic/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/amlogic/Kconfig b/drivers/crypto/amlogic/Kconfig
index cf9547602670..cf2c676a7093 100644
--- a/drivers/crypto/amlogic/Kconfig
+++ b/drivers/crypto/amlogic/Kconfig
@@ -1,7 +1,7 @@
 config CRYPTO_DEV_AMLOGIC_GXL
tristate "Support for amlogic cryptographic offloader"
depends on HAS_IOMEM
-   default y if ARCH_MESON
+   default m if ARCH_MESON
select CRYPTO_SKCIPHER
select CRYPTO_ENGINE
select CRYPTO_ECB
-- 
2.17.1



Re: [PATCH v2 4/4] crypto: qat - fallback for xts with 192 bit keys

2020-06-30 Thread Ard Biesheuvel
On Mon, 29 Jun 2020 at 19:05, Giovanni Cabiddu
 wrote:
>
> Thanks for your feedback Ard.
>
> On Fri, Jun 26, 2020 at 08:15:16PM +0200, Ard Biesheuvel wrote:
> > On Fri, 26 Jun 2020 at 10:04, Giovanni Cabiddu
> >  wrote:
> > >
> > > +static int qat_alg_skcipher_init_xts_tfm(struct crypto_skcipher *tfm)
> > > +{
> > > +   struct qat_alg_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm);
> > > +   int reqsize;
> > > +
> > > +   ctx->ftfm = crypto_alloc_skcipher("xts(aes)", 0, 
> > > CRYPTO_ALG_ASYNC);
> >
> > Why are you only permitting synchronous fallbacks? If the logic above
> > is sound, and copies the base.complete and base.data fields as well,
> > the fallback can complete asynchronously without problems.
> > Note that SIMD s/w implementations of XTS(AES) are asynchronous as
> > well, as they use the crypto_simd helper which queues requests for
> > asynchronous completion if the context from which the request was
> > issued does not permit access to the SIMD register file (e.g., softirq
> > context on some architectures, if the interrupted context is also
> > using SIMD)
> I did it this way since I thought I didn't have a way to test it with an
> asynchronous sw implementation.
> I changed this line to avoid masking the asynchronous implementations
> and test it by forcing simd.c to always use cryptd (don't know if there
> is a simpler way to do it).
>

This is exactly how I tested it in the past, but note that the
extended testing that Eric implemented will also run from a context
where SIMD is disabled artificially, and so you should be getting this
behavior in any case.

> Also, I added to the mask CRYPTO_ALG_NEED_FALLBACK so I don't get another
> implementation that requires a fallback.
>
> I'm going to send a v3.
>
> Regards,
>
> --
> Giovanni
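
For reference, the change under discussion would look roughly like the
sketch below once CRYPTO_ALG_ASYNC is dropped from the mask and
CRYPTO_ALG_NEED_FALLBACK is added; this is a sketch, not the actual v3
patch:

static int qat_alg_skcipher_init_xts_tfm(struct crypto_skcipher *tfm)
{
        struct qat_alg_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm);

        /* mask = CRYPTO_ALG_NEED_FALLBACK: async (e.g. SIMD based)
         * xts(aes) implementations remain eligible, while ones that
         * themselves need a fallback are skipped */
        ctx->ftfm = crypto_alloc_skcipher("xts(aes)", 0,
                                          CRYPTO_ALG_NEED_FALLBACK);
        if (IS_ERR(ctx->ftfm))
                return PTR_ERR(ctx->ftfm);

        /* the request size must still cover the fallback's request
         * context, as in the rest of the series */
        return 0;
}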


[PATCH 5/5] crypto: arm/ghash - use variably sized key struct

2020-06-29 Thread Ard Biesheuvel
Of the two versions of GHASH that the ARM driver implements, only one
performs aggregation, and so the other one has no use for the powers
of H to be precomputed, or space to be allocated for them in the key
struct. So make the context size dependent on which version is being
selected, and while at it, use a static key to carry this decision,
and get rid of the function pointer.
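
Condensed from the hunks below, the boot-time choice between the two code
paths ends up as a static key rather than a function pointer, and only the
p64 path gets room for the extra powers of H:

static __ro_after_init DEFINE_STATIC_KEY_FALSE(use_p64);

static void ghash_do_update(int blocks, u64 dg[], const char *src,
                            struct ghash_key *key, const char *head)
{
        /* non-SIMD fallback path omitted here; see the diff */
        kernel_neon_begin();
        if (static_branch_likely(&use_p64))
                pmull_ghash_update_p64(blocks, dg, src, key->h, head);
        else
                pmull_ghash_update_p8(blocks, dg, src, key->h, head);
        kernel_neon_end();
}

static int __init ghash_ce_mod_init(void)
{
        if (elf_hwcap2 & HWCAP2_PMULL) {
                /* only now is space for h[1]..h[3] actually needed */
                ghash_alg.base.cra_ctxsize += 3 * sizeof(u64[2]);
                static_branch_enable(&use_p64);
        }
        return crypto_register_shash(&ghash_alg);
}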

Signed-off-by: Ard Biesheuvel 
---
 arch/arm/crypto/ghash-ce-glue.c | 51 +---
 1 file changed, 24 insertions(+), 27 deletions(-)

diff --git a/arch/arm/crypto/ghash-ce-glue.c b/arch/arm/crypto/ghash-ce-glue.c
index a00fd329255f..f13401f3e669 100644
--- a/arch/arm/crypto/ghash-ce-glue.c
+++ b/arch/arm/crypto/ghash-ce-glue.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 MODULE_DESCRIPTION("GHASH hash function using ARMv8 Crypto Extensions");
@@ -27,12 +28,8 @@ MODULE_ALIAS_CRYPTO("ghash");
 #define GHASH_DIGEST_SIZE  16
 
 struct ghash_key {
-   u64 h[2];
-   u64 h2[2];
-   u64 h3[2];
-   u64 h4[2];
-
be128   k;
+   u64 h[][2];
 };
 
 struct ghash_desc_ctx {
@@ -46,16 +43,12 @@ struct ghash_async_ctx {
 };
 
 asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src,
-  struct ghash_key const *k,
-  const char *head);
+  u64 const h[][2], const char *head);
 
 asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src,
- struct ghash_key const *k,
- const char *head);
+ u64 const h[][2], const char *head);
 
-static void (*pmull_ghash_update)(int blocks, u64 dg[], const char *src,
- struct ghash_key const *k,
- const char *head);
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(use_p64);
 
 static int ghash_init(struct shash_desc *desc)
 {
@@ -70,7 +63,10 @@ static void ghash_do_update(int blocks, u64 dg[], const char 
*src,
 {
if (likely(crypto_simd_usable())) {
kernel_neon_begin();
-   pmull_ghash_update(blocks, dg, src, key, head);
+   if (static_branch_likely(&use_p64))
+   pmull_ghash_update_p64(blocks, dg, src, key->h, head);
+   else
+   pmull_ghash_update_p8(blocks, dg, src, key->h, head);
kernel_neon_end();
} else {
be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };
@@ -161,25 +157,26 @@ static int ghash_setkey(struct crypto_shash *tfm,
const u8 *inkey, unsigned int keylen)
 {
struct ghash_key *key = crypto_shash_ctx(tfm);
-   be128 h;
 
if (keylen != GHASH_BLOCK_SIZE)
return -EINVAL;
 
/* needed for the fallback */
memcpy(&key->k, inkey, GHASH_BLOCK_SIZE);
-   ghash_reflect(key->h, &key->k);
+   ghash_reflect(key->h[0], &key->k);
 
-   h = key->k;
-   gf128mul_lle(&h, &key->k);
-   ghash_reflect(key->h2, &h);
+   if (static_branch_likely(&use_p64)) {
+   be128 h = key->k;
 
-   gf128mul_lle(&h, &key->k);
-   ghash_reflect(key->h3, &h);
+   gf128mul_lle(&h, &key->k);
+   ghash_reflect(key->h[1], &h);
 
-   gf128mul_lle(&h, &key->k);
-   ghash_reflect(key->h4, &h);
+   gf128mul_lle(&h, &key->k);
+   ghash_reflect(key->h[2], &h);
 
+   gf128mul_lle(&h, &key->k);
+   ghash_reflect(key->h[3], &h);
+   }
return 0;
 }
 
@@ -195,7 +192,7 @@ static struct shash_alg ghash_alg = {
.base.cra_driver_name   = "ghash-ce-sync",
.base.cra_priority  = 300 - 1,
.base.cra_blocksize = GHASH_BLOCK_SIZE,
-   .base.cra_ctxsize   = sizeof(struct ghash_key),
+   .base.cra_ctxsize   = sizeof(struct ghash_key) + sizeof(u64[2]),
.base.cra_module= THIS_MODULE,
 };
 
@@ -354,10 +351,10 @@ static int __init ghash_ce_mod_init(void)
if (!(elf_hwcap & HWCAP_NEON))
return -ENODEV;
 
-   if (elf_hwcap2 & HWCAP2_PMULL)
-   pmull_ghash_update = pmull_ghash_update_p64;
-   else
-   pmull_ghash_update = pmull_ghash_update_p8;
+   if (elf_hwcap2 & HWCAP2_PMULL) {
+   ghash_alg.base.cra_ctxsize += 3 * sizeof(u64[2]);
+   static_branch_enable(&use_p64);
+   }
 
err = crypto_register_shash(&ghash_alg);
if (err)
-- 
2.20.1



[PATCH 3/5] crypto: arm64/gcm - use variably sized key struct

2020-06-29 Thread Ard Biesheuvel
Now that the ghash and gcm drivers are split, we no longer need to allocate
a key struct for the former that carries powers of H that are only used by
the latter. Also, take this opportunity to clean up the code a little bit.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/ghash-ce-glue.c | 49 +---
 1 file changed, 21 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-glue.c 
b/arch/arm64/crypto/ghash-ce-glue.c
index 921fa69b5ded..2ae95dcf648f 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -31,12 +31,8 @@ MODULE_ALIAS_CRYPTO("ghash");
 #define GCM_IV_SIZE12
 
 struct ghash_key {
-   u64 h[2];
-   u64 h2[2];
-   u64 h3[2];
-   u64 h4[2];
-
be128   k;
+   u64 h[][2];
 };
 
 struct ghash_desc_ctx {
@@ -51,22 +47,18 @@ struct gcm_aes_ctx {
 };
 
 asmlinkage void pmull_ghash_update_p64(int blocks, u64 dg[], const char *src,
-  struct ghash_key const *k,
-  const char *head);
+  u64 const h[][2], const char *head);
 
 asmlinkage void pmull_ghash_update_p8(int blocks, u64 dg[], const char *src,
- struct ghash_key const *k,
- const char *head);
+ u64 const h[][2], const char *head);
 
 asmlinkage void pmull_gcm_encrypt(int bytes, u8 dst[], const u8 src[],
- struct ghash_key const *k, u64 dg[],
- u8 ctr[], u32 const rk[], int rounds,
- u8 tag[]);
+ u64 const h[][2], u64 dg[], u8 ctr[],
+ u32 const rk[], int rounds, u8 tag[]);
 
 asmlinkage void pmull_gcm_decrypt(int bytes, u8 dst[], const u8 src[],
- struct ghash_key const *k, u64 dg[],
- u8 ctr[], u32 const rk[], int rounds,
- u8 tag[]);
+ u64 const h[][2], u64 dg[], u8 ctr[],
+ u32 const rk[], int rounds, u8 tag[]);
 
 static int ghash_init(struct shash_desc *desc)
 {
@@ -80,12 +72,12 @@ static void ghash_do_update(int blocks, u64 dg[], const 
char *src,
struct ghash_key *key, const char *head,
void (*simd_update)(int blocks, u64 dg[],
const char *src,
-   struct ghash_key const *k,
+   u64 const h[][2],
const char *head))
 {
if (likely(crypto_simd_usable() && simd_update)) {
kernel_neon_begin();
-   simd_update(blocks, dg, src, key, head);
+   simd_update(blocks, dg, src, key->h, head);
kernel_neon_end();
} else {
be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };
@@ -195,7 +187,7 @@ static int ghash_setkey(struct crypto_shash *tfm,
/* needed for the fallback */
memcpy(&key->k, inkey, GHASH_BLOCK_SIZE);
 
-   ghash_reflect(key->h, &key->k);
+   ghash_reflect(key->h[0], &key->k);
return 0;
 }
 
@@ -204,7 +196,7 @@ static struct shash_alg ghash_alg = {
.base.cra_driver_name   = "ghash-neon",
.base.cra_priority  = 150,
.base.cra_blocksize = GHASH_BLOCK_SIZE,
-   .base.cra_ctxsize   = sizeof(struct ghash_key),
+   .base.cra_ctxsize   = sizeof(struct ghash_key) + sizeof(u64[2]),
.base.cra_module= THIS_MODULE,
 
.digestsize = GHASH_DIGEST_SIZE,
@@ -244,17 +236,17 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 
*inkey,
/* needed for the fallback */
memcpy(&ctx->ghash_key.k, key, GHASH_BLOCK_SIZE);
 
-   ghash_reflect(ctx->ghash_key.h, &ctx->ghash_key.k);
+   ghash_reflect(ctx->ghash_key.h[0], &ctx->ghash_key.k);
 
h = ctx->ghash_key.k;
gf128mul_lle(&h, &ctx->ghash_key.k);
-   ghash_reflect(ctx->ghash_key.h2, &h);
+   ghash_reflect(ctx->ghash_key.h[1], &h);
 
gf128mul_lle(&h, &ctx->ghash_key.k);
-   ghash_reflect(ctx->ghash_key.h3, &h);
+   ghash_reflect(ctx->ghash_key.h[2], &h);
 
gf128mul_lle(&h, &ctx->ghash_key.k);
-   ghash_reflect(ctx->ghash_key.h4, &h);
+   ghash_reflect(ctx->ghash_key.h[3], &h);
 
return 0;
 }
@@ -380,8 +372,8 @@ static int gcm_encrypt(struct aead_req

[PATCH 4/5] crypto: arm64/gcm - use inline helper to suppress indirect calls

2020-06-29 Thread Ard Biesheuvel
Introduce an inline wrapper for ghash_do_update() that incorporates
the indirect call to the asm routine that is passed as an argument,
and keep the non-SIMD fallback code out of line. This ensures that
all references to the function pointer are inlined where the address
is taken, removing the need for any indirect calls to begin with.
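
The wrapper in question, lifted from the diff below and shown without the
+/- markers for readability: because it is __always_inline, every call
site passes pmull_ghash_update_p8 or _p64 as a compile-time constant, and
the compiler emits a direct call.

static __always_inline
void ghash_do_simd_update(int blocks, u64 dg[], const char *src,
                          struct ghash_key *key, const char *head,
                          void (*simd_update)(int blocks, u64 dg[],
                                              const char *src,
                                              u64 const h[][2],
                                              const char *head))
{
        if (likely(crypto_simd_usable())) {
                kernel_neon_begin();
                simd_update(blocks, dg, src, key->h, head);
                kernel_neon_end();
        } else {
                ghash_do_update(blocks, dg, src, key, head); /* out of line */
        }
}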

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/ghash-ce-glue.c | 85 +++-
 1 file changed, 46 insertions(+), 39 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-glue.c 
b/arch/arm64/crypto/ghash-ce-glue.c
index 2ae95dcf648f..da1034867aaa 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -69,36 +69,43 @@ static int ghash_init(struct shash_desc *desc)
 }
 
 static void ghash_do_update(int blocks, u64 dg[], const char *src,
-   struct ghash_key *key, const char *head,
-   void (*simd_update)(int blocks, u64 dg[],
-   const char *src,
-   u64 const h[][2],
-   const char *head))
+   struct ghash_key *key, const char *head)
 {
-   if (likely(crypto_simd_usable() && simd_update)) {
+   be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };
+
+   do {
+   const u8 *in = src;
+
+   if (head) {
+   in = head;
+   blocks++;
+   head = NULL;
+   } else {
+   src += GHASH_BLOCK_SIZE;
+   }
+
+   crypto_xor((u8 *)&dst, in, GHASH_BLOCK_SIZE);
+   gf128mul_lle(&dst, &key->k);
+   } while (--blocks);
+
+   dg[0] = be64_to_cpu(dst.b);
+   dg[1] = be64_to_cpu(dst.a);
+}
+
+static __always_inline
+void ghash_do_simd_update(int blocks, u64 dg[], const char *src,
+ struct ghash_key *key, const char *head,
+ void (*simd_update)(int blocks, u64 dg[],
+ const char *src,
+ u64 const h[][2],
+ const char *head))
+{
+   if (likely(crypto_simd_usable())) {
kernel_neon_begin();
simd_update(blocks, dg, src, key->h, head);
kernel_neon_end();
} else {
-   be128 dst = { cpu_to_be64(dg[1]), cpu_to_be64(dg[0]) };
-
-   do {
-   const u8 *in = src;
-
-   if (head) {
-   in = head;
-   blocks++;
-   head = NULL;
-   } else {
-   src += GHASH_BLOCK_SIZE;
-   }
-
-   crypto_xor((u8 *)&dst, in, GHASH_BLOCK_SIZE);
-   gf128mul_lle(&dst, &key->k);
-   } while (--blocks);
-
-   dg[0] = be64_to_cpu(dst.b);
-   dg[1] = be64_to_cpu(dst.a);
+   ghash_do_update(blocks, dg, src, key, head);
}
 }
 
@@ -131,9 +138,9 @@ static int ghash_update(struct shash_desc *desc, const u8 
*src,
do {
int chunk = min(blocks, MAX_BLOCKS);
 
-   ghash_do_update(chunk, ctx->digest, src, key,
-   partial ? ctx->buf : NULL,
-   pmull_ghash_update_p8);
+   ghash_do_simd_update(chunk, ctx->digest, src, key,
+partial ? ctx->buf : NULL,
+pmull_ghash_update_p8);
 
blocks -= chunk;
src += chunk * GHASH_BLOCK_SIZE;
@@ -155,8 +162,8 @@ static int ghash_final(struct shash_desc *desc, u8 *dst)
 
memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial);
 
-   ghash_do_update(1, ctx->digest, ctx->buf, key, NULL,
-   pmull_ghash_update_p8);
+   ghash_do_simd_update(1, ctx->digest, ctx->buf, key, NULL,
+pmull_ghash_update_p8);
}
put_unaligned_be64(ctx->digest[1], dst);
put_unaligned_be64(ctx->digest[0], dst + 8);
@@ -280,9 +287,9 @@ static void gcm_update_mac(u64 dg[], const u8 *src, int 
count, u8 buf[],
if (count >= GHASH_BLOCK_SIZE || *buf_count == GHASH_BLOCK_SIZE) {
int blocks = count / GHASH_BLOCK_SIZE;
 
-   ghash_do_update(blocks, dg, src, &ctx->ghash_key,
-   *buf_count ? buf : NULL,
-   pmull_ghash_update_p64);
+   ghash_do_simd_update(blocks, dg

[PATCH 2/5] crypto: arm64/gcm - disentangle ghash and gcm setkey() routines

2020-06-29 Thread Ard Biesheuvel
The remaining ghash implementation does not support aggregation, and so
there is no point in including the precomputed powers of H in the key
struct. So move that into the GCM setkey routine, and get rid of the
shared sub-routine entirely.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/ghash-ce-glue.c | 47 +---
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-glue.c 
b/arch/arm64/crypto/ghash-ce-glue.c
index be63d8b5152c..921fa69b5ded 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -184,29 +184,6 @@ static void ghash_reflect(u64 h[], const be128 *k)
h[1] ^= 0xc200UL;
 }
 
-static int __ghash_setkey(struct ghash_key *key,
- const u8 *inkey, unsigned int keylen)
-{
-   be128 h;
-
-   /* needed for the fallback */
-   memcpy(&key->k, inkey, GHASH_BLOCK_SIZE);
-
-   ghash_reflect(key->h, &key->k);
-
-   h = key->k;
-   gf128mul_lle(&h, &key->k);
-   ghash_reflect(key->h2, &h);
-
-   gf128mul_lle(&h, &key->k);
-   ghash_reflect(key->h3, &h);
-
-   gf128mul_lle(&h, &key->k);
-   ghash_reflect(key->h4, &h);
-
-   return 0;
-}
-
 static int ghash_setkey(struct crypto_shash *tfm,
const u8 *inkey, unsigned int keylen)
 {
@@ -215,7 +192,11 @@ static int ghash_setkey(struct crypto_shash *tfm,
if (keylen != GHASH_BLOCK_SIZE)
return -EINVAL;
 
-   return __ghash_setkey(key, inkey, keylen);
+   /* needed for the fallback */
+   memcpy(&key->k, inkey, GHASH_BLOCK_SIZE);
+
+   ghash_reflect(key->h, &key->k);
+   return 0;
 }
 
 static struct shash_alg ghash_alg = {
@@ -251,6 +232,7 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 
*inkey,
 {
struct gcm_aes_ctx *ctx = crypto_aead_ctx(tfm);
u8 key[GHASH_BLOCK_SIZE];
+   be128 h;
int ret;
 
ret = aes_expandkey(&ctx->aes_key, inkey, keylen);
@@ -259,7 +241,22 @@ static int gcm_setkey(struct crypto_aead *tfm, const u8 
*inkey,
 
aes_encrypt(&ctx->aes_key, key, (u8[AES_BLOCK_SIZE]){});
 
-   return __ghash_setkey(&ctx->ghash_key, key, sizeof(be128));
+   /* needed for the fallback */
+   memcpy(&ctx->ghash_key.k, key, GHASH_BLOCK_SIZE);
+
+   ghash_reflect(ctx->ghash_key.h, &ctx->ghash_key.k);
+
+   h = ctx->ghash_key.k;
+   gf128mul_lle(&h, &ctx->ghash_key.k);
+   ghash_reflect(ctx->ghash_key.h2, &h);
+
+   gf128mul_lle(&h, &ctx->ghash_key.k);
+   ghash_reflect(ctx->ghash_key.h3, &h);
+
+   gf128mul_lle(&h, &ctx->ghash_key.k);
+   ghash_reflect(ctx->ghash_key.h4, &h);
+
+   return 0;
 }
 
 static int gcm_setauthsize(struct crypto_aead *tfm, unsigned int authsize)
-- 
2.20.1



[PATCH 1/5] crypto: arm64/ghash - drop PMULL based shash

2020-06-29 Thread Ard Biesheuvel
There are two ways to implement SIMD accelerated GCM on arm64:
- using the PMULL instructions for carryless 64x64->128 multiplication,
  in which case the architecture guarantees that the AES instructions are
  available as well, and so we can use the AEAD implementation that combines
  both,
- using the PMULL instructions for carryless 8x8->16 bit multiplication,
  which is implemented as a shash, and can be combined with any ctr(aes)
  implementation by the generic GCM AEAD template driver.

So let's drop the 64x64->128 shash driver, which is never needed for GCM,
and not suitable for use anywhere else.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/ghash-ce-glue.c | 90 +++-
 1 file changed, 12 insertions(+), 78 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-glue.c 
b/arch/arm64/crypto/ghash-ce-glue.c
index 22831d3b7f62..be63d8b5152c 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -113,12 +113,8 @@ static void ghash_do_update(int blocks, u64 dg[], const 
char *src,
 /* avoid hogging the CPU for too long */
 #define MAX_BLOCKS (SZ_64K / GHASH_BLOCK_SIZE)
 
-static int __ghash_update(struct shash_desc *desc, const u8 *src,
- unsigned int len,
- void (*simd_update)(int blocks, u64 dg[],
- const char *src,
- struct ghash_key const *k,
- const char *head))
+static int ghash_update(struct shash_desc *desc, const u8 *src,
+   unsigned int len)
 {
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
unsigned int partial = ctx->count % GHASH_BLOCK_SIZE;
@@ -145,7 +141,7 @@ static int __ghash_update(struct shash_desc *desc, const u8 
*src,
 
ghash_do_update(chunk, ctx->digest, src, key,
partial ? ctx->buf : NULL,
-   simd_update);
+   pmull_ghash_update_p8);
 
blocks -= chunk;
src += chunk * GHASH_BLOCK_SIZE;
@@ -157,19 +153,7 @@ static int __ghash_update(struct shash_desc *desc, const 
u8 *src,
return 0;
 }
 
-static int ghash_update_p8(struct shash_desc *desc, const u8 *src,
-  unsigned int len)
-{
-   return __ghash_update(desc, src, len, pmull_ghash_update_p8);
-}
-
-static int ghash_update_p64(struct shash_desc *desc, const u8 *src,
-   unsigned int len)
-{
-   return __ghash_update(desc, src, len, pmull_ghash_update_p64);
-}
-
-static int ghash_final_p8(struct shash_desc *desc, u8 *dst)
+static int ghash_final(struct shash_desc *desc, u8 *dst)
 {
struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
unsigned int partial = ctx->count % GHASH_BLOCK_SIZE;
@@ -189,26 +173,6 @@ static int ghash_final_p8(struct shash_desc *desc, u8 *dst)
return 0;
 }
 
-static int ghash_final_p64(struct shash_desc *desc, u8 *dst)
-{
-   struct ghash_desc_ctx *ctx = shash_desc_ctx(desc);
-   unsigned int partial = ctx->count % GHASH_BLOCK_SIZE;
-
-   if (partial) {
-   struct ghash_key *key = crypto_shash_ctx(desc->tfm);
-
-   memset(ctx->buf + partial, 0, GHASH_BLOCK_SIZE - partial);
-
-   ghash_do_update(1, ctx->digest, ctx->buf, key, NULL,
-   pmull_ghash_update_p64);
-   }
-   put_unaligned_be64(ctx->digest[1], dst);
-   put_unaligned_be64(ctx->digest[0], dst + 8);
-
-   *ctx = (struct ghash_desc_ctx){};
-   return 0;
-}
-
 static void ghash_reflect(u64 h[], const be128 *k)
 {
u64 carry = be64_to_cpu(k->a) & BIT(63) ? 1 : 0;
@@ -254,7 +218,7 @@ static int ghash_setkey(struct crypto_shash *tfm,
return __ghash_setkey(key, inkey, keylen);
 }
 
-static struct shash_alg ghash_alg[] = {{
+static struct shash_alg ghash_alg = {
.base.cra_name  = "ghash",
.base.cra_driver_name   = "ghash-neon",
.base.cra_priority  = 150,
@@ -264,25 +228,11 @@ static struct shash_alg ghash_alg[] = {{
 
.digestsize = GHASH_DIGEST_SIZE,
.init   = ghash_init,
-   .update = ghash_update_p8,
-   .final  = ghash_final_p8,
-   .setkey = ghash_setkey,
-   .descsize   = sizeof(struct ghash_desc_ctx),
-}, {
-   .base.cra_name  = "ghash",
-   .base.cra_driver_name   = "ghash-ce",
-   .base.cra_priority  = 200,
-   .base.cra_blocksize = GHASH_BLOCK_SIZE,
-   .base.cra_ctxsize   = sizeof(struct ghash_key),
-   .base.cra_module= THIS_MODULE,
-
-   .digestsize = GHASH_DIGEST_SIZE,
- 

[PATCH 0/5] crypto: clean up ARM/arm64 glue code for GHASH and GCM

2020-06-29 Thread Ard Biesheuvel
Get rid of pointless indirect calls where the target of the call is decided
at boot and never changes. Also, make the size of the key struct variable,
and only carry the extra keys needed for aggregation when using a version
of the algorithm that makes use of them.

Ard Biesheuvel (5):
  crypto: arm64/ghash - drop PMULL based shash
  crypto: arm64/gcm - disentangle ghash and gcm setkey() routines
  crypto: arm64/gcm - use variably sized key struct
  crypto: arm64/gcm - use inline helper to suppress indirect calls
  crypto: arm/ghash - use variably sized key struct

 arch/arm/crypto/ghash-ce-glue.c   |  51 ++--
 arch/arm64/crypto/ghash-ce-glue.c | 257 +++-
 2 files changed, 118 insertions(+), 190 deletions(-)

-- 
2.20.1



[PATCH v2 00/13] crypto: permit asynchronous skciphers as driver fallbacks

2020-06-27 Thread Ard Biesheuvel
The drivers for crypto accelerators in drivers/crypto all implement skciphers
of an asynchronous nature, given that they are backed by hardware DMA that
completes asynchronously wrt the execution flow.

However, in many cases, any fallbacks they allocate are limited to the
synchronous variety, which rules out the use of SIMD implementations of
AES in ECB, CBC and XTS modes, given that they are usually built on top
of the asynchronous SIMD helper, which queues requests for asynchronous
completion if they are issued from a context that does not permit the use
of the SIMD register file.

This may result in sub-optimal AES implementations being selected as
fallbacks, or even less secure ones if the only synchronous alternative
is table based, and therefore not time invariant.

So switch all these cases over to the asynchronous API, by moving the
subrequest into the skcipher request context, and permitting it to
complete asynchronously via the caller provided completion function.

Patch #1 is not related, but touches the same driver as #2 so it is
included anyway. Patch #13 removes another sync skcipher allocation by
switching to the AES library interface.

Only OMAP was tested on actual hardware - the others are build tested only.

v2:
- address issue found by build robot in patch #7
- add patch #13
- rebase onto cryptodev/master

Cc: Corentin Labbe 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: Maxime Ripard 
Cc: Chen-Yu Tsai 
Cc: Tom Lendacky 
Cc: Ayush Sawal 
Cc: Vinay Kumar Yadav 
Cc: Rohit Maheshwari 
Cc: Shawn Guo 
Cc: Sascha Hauer 
Cc: Pengutronix Kernel Team 
Cc: Fabio Estevam 
Cc: NXP Linux Team 
Cc: Jamie Iles 
Cc: Eric Biggers 
Cc: Tero Kristo 
Cc: Matthias Brugger 

Ard Biesheuvel (13):
  crypto: amlogic-gxl - default to build as module
  crypto: amlogic-gxl - permit async skcipher as fallback
  crypto: omap-aes - permit asynchronous skcipher as fallback
  crypto: sun4i - permit asynchronous skcipher as fallback
  crypto: sun8i-ce - permit asynchronous skcipher as fallback
  crypto: sun8i-ss - permit asynchronous skcipher as fallback
  crypto: ccp - permit asynchronous skcipher as fallback
  crypto: chelsio - permit asynchronous skcipher as fallback
  crypto: mxs-dcp - permit asynchronous skcipher as fallback
  crypto: picoxcell - permit asynchronous skcipher as fallback
  crypto: qce - permit asynchronous skcipher as fallback
  crypto: sahara - permit asynchronous skcipher as fallback
  crypto: mediatek - use AES library for GCM key derivation

 drivers/crypto/Kconfig  |  3 +-
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c | 46 +-
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss.h|  3 +-
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 41 -
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h|  3 +-
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c | 39 
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h|  3 +-
 drivers/crypto/amlogic/Kconfig  |  2 +-
 drivers/crypto/amlogic/amlogic-gxl-cipher.c | 27 +++---
 drivers/crypto/amlogic/amlogic-gxl.h|  3 +-
 drivers/crypto/ccp/ccp-crypto-aes-xts.c | 33 ---
 drivers/crypto/ccp/ccp-crypto.h |  4 +-
 drivers/crypto/chelsio/chcr_algo.c  | 57 +---
 drivers/crypto/chelsio/chcr_crypto.h|  3 +-
 drivers/crypto/mediatek/mtk-aes.c   | 63 ++---
 drivers/crypto/mxs-dcp.c| 33 +++
 drivers/crypto/omap-aes.c   | 35 ---
 drivers/crypto/omap-aes.h   |  3 +-
 drivers/crypto/picoxcell_crypto.c   | 34 ---
 drivers/crypto/qce/cipher.h |  3 +-
 drivers/crypto/qce/skcipher.c   | 27 +++---
 drivers/crypto/sahara.c | 96 +---
 22 files changed, 254 insertions(+), 307 deletions(-)

-- 
2.27.0



[PATCH v2 13/13] crypto: mediatek - use AES library for GCM key derivation

2020-06-27 Thread Ard Biesheuvel
The Mediatek accelerator driver calls into a dynamically allocated
skcipher of the ctr(aes) variety to perform GCM key derivation, which
involves AES encryption of a single block consisting of NUL bytes.

There is no point in using the skcipher API for this, so use the AES
library interface instead.
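
Condensed from the setkey hunk below, the derivation is simply
H = AES_K(0^128), which the AES library can compute synchronously:

        u8 hash[AES_BLOCK_SIZE] __aligned(4) = {};
        struct crypto_aes_ctx aes_ctx;
        int err;

        err = aes_expandkey(&aes_ctx, key, keylen);
        if (err)
                return err;

        aes_encrypt(&aes_ctx, hash, hash);      /* input is the zero block */
        memzero_explicit(&aes_ctx, sizeof(aes_ctx));
        /* 'hash' now holds the hash subkey H to be programmed into the engine */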

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/Kconfig|  3 +-
 drivers/crypto/mediatek/mtk-aes.c | 63 +++-
 2 files changed, 9 insertions(+), 57 deletions(-)

diff --git a/drivers/crypto/Kconfig b/drivers/crypto/Kconfig
index 802b9ada4e9e..c8c3ebb248f8 100644
--- a/drivers/crypto/Kconfig
+++ b/drivers/crypto/Kconfig
@@ -756,10 +756,9 @@ config CRYPTO_DEV_ZYNQMP_AES
 config CRYPTO_DEV_MEDIATEK
tristate "MediaTek's EIP97 Cryptographic Engine driver"
depends on (ARM && ARCH_MEDIATEK) || COMPILE_TEST
-   select CRYPTO_AES
+   select CRYPTO_LIB_AES
select CRYPTO_AEAD
select CRYPTO_SKCIPHER
-   select CRYPTO_CTR
select CRYPTO_SHA1
select CRYPTO_SHA256
select CRYPTO_SHA512
diff --git a/drivers/crypto/mediatek/mtk-aes.c 
b/drivers/crypto/mediatek/mtk-aes.c
index 78d660d963e2..4ad3571ab6af 100644
--- a/drivers/crypto/mediatek/mtk-aes.c
+++ b/drivers/crypto/mediatek/mtk-aes.c
@@ -137,8 +137,6 @@ struct mtk_aes_gcm_ctx {
 
u32 authsize;
size_t textlen;
-
-   struct crypto_skcipher *ctr;
 };
 
 struct mtk_aes_drv {
@@ -996,17 +994,8 @@ static int mtk_aes_gcm_setkey(struct crypto_aead *aead, 
const u8 *key,
  u32 keylen)
 {
struct mtk_aes_base_ctx *ctx = crypto_aead_ctx(aead);
-   struct mtk_aes_gcm_ctx *gctx = mtk_aes_gcm_ctx_cast(ctx);
-   struct crypto_skcipher *ctr = gctx->ctr;
-   struct {
-   u32 hash[4];
-   u8 iv[8];
-
-   struct crypto_wait wait;
-
-   struct scatterlist sg[1];
-   struct skcipher_request req;
-   } *data;
+   u8 hash[AES_BLOCK_SIZE] __aligned(4) = {};
+   struct crypto_aes_ctx aes_ctx;
int err;
 
switch (keylen) {
@@ -1026,39 +1015,18 @@ static int mtk_aes_gcm_setkey(struct crypto_aead *aead, 
const u8 *key,
 
ctx->keylen = SIZE_IN_WORDS(keylen);
 
-   /* Same as crypto_gcm_setkey() from crypto/gcm.c */
-   crypto_skcipher_clear_flags(ctr, CRYPTO_TFM_REQ_MASK);
-   crypto_skcipher_set_flags(ctr, crypto_aead_get_flags(aead) &
- CRYPTO_TFM_REQ_MASK);
-   err = crypto_skcipher_setkey(ctr, key, keylen);
+   err = aes_expandkey(&aes_ctx, key, keylen);
if (err)
return err;
 
-   data = kzalloc(sizeof(*data) + crypto_skcipher_reqsize(ctr),
-  GFP_KERNEL);
-   if (!data)
-   return -ENOMEM;
-
-   crypto_init_wait(&data->wait);
-   sg_init_one(data->sg, &data->hash, AES_BLOCK_SIZE);
-   skcipher_request_set_tfm(&data->req, ctr);
-   skcipher_request_set_callback(&data->req, CRYPTO_TFM_REQ_MAY_SLEEP |
- CRYPTO_TFM_REQ_MAY_BACKLOG,
- crypto_req_done, &data->wait);
-   skcipher_request_set_crypt(&data->req, data->sg, data->sg,
-  AES_BLOCK_SIZE, data->iv);
-
-   err = crypto_wait_req(crypto_skcipher_encrypt(&data->req),
- &data->wait);
-   if (err)
-   goto out;
+   aes_encrypt(&aes_ctx, hash, hash);
+   memzero_explicit(&aes_ctx, sizeof(aes_ctx));
 
mtk_aes_write_state_le(ctx->key, (const u32 *)key, keylen);
-   mtk_aes_write_state_be(ctx->key + ctx->keylen, data->hash,
+   mtk_aes_write_state_be(ctx->key + ctx->keylen, (const u32 *)hash,
   AES_BLOCK_SIZE);
-out:
-   kzfree(data);
-   return err;
+
+   return 0;
 }
 
 static int mtk_aes_gcm_setauthsize(struct crypto_aead *aead,
@@ -1095,32 +1063,17 @@ static int mtk_aes_gcm_init(struct crypto_aead *aead)
 {
struct mtk_aes_gcm_ctx *ctx = crypto_aead_ctx(aead);
 
-   ctx->ctr = crypto_alloc_skcipher("ctr(aes)", 0,
-CRYPTO_ALG_ASYNC);
-   if (IS_ERR(ctx->ctr)) {
-   pr_err("Error allocating ctr(aes)\n");
-   return PTR_ERR(ctx->ctr);
-   }
-
crypto_aead_set_reqsize(aead, sizeof(struct mtk_aes_reqctx));
ctx->base.start = mtk_aes_gcm_start;
return 0;
 }
 
-static void mtk_aes_gcm_exit(struct crypto_aead *aead)
-{
-   struct mtk_aes_gcm_ctx *ctx = crypto_aead_ctx(aead);
-
-   crypto_free_skcipher(ctx->ctr);
-}
-
 static struct aead_alg aes_gcm_alg = {
.setkey = mtk_aes_gcm_setkey,
.setauthsize= mtk_aes_gcm_setauthsize,
.encrypt= mtk_a

[PATCH v2 01/13] crypto: amlogic-gxl - default to build as module

2020-06-27 Thread Ard Biesheuvel
The AmLogic GXL crypto accelerator driver is built into the kernel if
ARCH_MESON is set. However, given the single image policy of arm64, its
defconfig enables all platforms by default, and so ARCH_MESON is usually
enabled.

This means that the AmLogic driver causes the arm64 defconfig build to
pull in a huge chunk of the crypto stack as a builtin as well, which is
undesirable, so let's make the AmLogic GXL driver default to 'm' instead.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/amlogic/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/amlogic/Kconfig b/drivers/crypto/amlogic/Kconfig
index cf9547602670..cf2c676a7093 100644
--- a/drivers/crypto/amlogic/Kconfig
+++ b/drivers/crypto/amlogic/Kconfig
@@ -1,7 +1,7 @@
 config CRYPTO_DEV_AMLOGIC_GXL
tristate "Support for amlogic cryptographic offloader"
depends on HAS_IOMEM
-   default y if ARCH_MESON
+   default m if ARCH_MESON
select CRYPTO_SKCIPHER
select CRYPTO_ENGINE
select CRYPTO_ECB
-- 
2.27.0



[PATCH v2 02/13] crypto: amlogic-gxl - permit async skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the amlogic-gxl driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue,
but potentially a security issue as well (due to the fact that table
based AES is not time invariant), let's fix this by allocating an
ordinary skcipher as the fallback, and invoking it with the completion
routine that was given to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/amlogic/amlogic-gxl-cipher.c | 27 ++--
 drivers/crypto/amlogic/amlogic-gxl.h|  3 ++-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/amlogic/amlogic-gxl-cipher.c 
b/drivers/crypto/amlogic/amlogic-gxl-cipher.c
index 9819dd50fbad..5880b94dcb32 100644
--- a/drivers/crypto/amlogic/amlogic-gxl-cipher.c
+++ b/drivers/crypto/amlogic/amlogic-gxl-cipher.c
@@ -64,22 +64,20 @@ static int meson_cipher_do_fallback(struct skcipher_request 
*areq)
 #ifdef CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct meson_alg_template *algt;
-#endif
-   SYNC_SKCIPHER_REQUEST_ON_STACK(req, op->fallback_tfm);
 
-#ifdef CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG
algt = container_of(alg, struct meson_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
-   skcipher_request_set_sync_tfm(req, op->fallback_tfm);
-   skcipher_request_set_callback(req, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(req, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
+
if (rctx->op_dir == MESON_DECRYPT)
-   err = crypto_skcipher_decrypt(req);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(req);
-   skcipher_request_zero(req);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -321,15 +319,16 @@ int meson_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct meson_alg_template, alg.skcipher);
op->mc = algt->mc;
 
-   sktfm->reqsize = sizeof(struct meson_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->mc->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct meson_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
op->enginectx.op.do_one_request = meson_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
op->enginectx.op.unprepare_request = NULL;
@@ -345,7 +344,7 @@ void meson_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
 }
 
 int meson_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
@@ -377,5 +376,5 @@ int meson_aes_setkey(struct crypto_skcipher *tfm, const u8 
*key,
if (!op->key)
return -ENOMEM;
 
-   return crypto_sync_skcipher_setkey(op->fallback_tfm, key, keylen);
+   return crypto_skcipher_setkey(op->fallback_tfm, key, keylen);
 }
diff --git a/drivers/crypto/amlogic/amlogic-gxl.h 
b/drivers/crypto/amlogic/amlogic-gxl.h
index b7f2de91ab76..dc0f142324a3 100644
--- a/drivers/crypto/amlogic/amlogic-gxl.h
+++ b/drivers/crypto/amlogic/amlogic-gxl.h
@@ -109,6 +109,7 @@ struct meson_dev {
 struct meson_cipher_req_ctx {
u32 op_dir;
int flow;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 /*
@@ -126,7 +127,7 @@ struct meson_cipher_tfm_ctx {
u32 keylen;
u32 keymode;
struct meson_dev *mc;
-   struct crypto_sync_skcipher *fallback_tfm;
+   struct crypto_skcipher *fallback_tfm;
 };
 
 /*
-- 
2.27.0



[PATCH v2 03/13] crypto: omap-aes - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the omap-aes driver implements asynchronous versions of
ecb(aes), cbc(aes) and ctr(aes), the fallbacks it allocates are required
to be synchronous. Given that SIMD based software implementations are
usually asynchronous as well, even though they rarely complete
asynchronously (this typically only happens in cases where the request was
made from softirq context, while SIMD was already in use in the task
context that it interrupted), these implementations are disregarded, and
either the generic C version or another table based version implemented in
assembler is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback, and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/omap-aes.c | 35 ++--
 drivers/crypto/omap-aes.h |  3 +-
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index b5aff20c5900..25154b74dcc6 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -548,20 +548,18 @@ static int omap_aes_crypt(struct skcipher_request *req, 
unsigned long mode)
  !!(mode & FLAGS_CBC));
 
if (req->cryptlen < aes_fallback_sz) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL,
- NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
 
if (mode & FLAGS_ENCRYPT)
-   ret = crypto_skcipher_encrypt(subreq);
+   ret = crypto_skcipher_encrypt(&rctx->fallback_req);
else
-   ret = crypto_skcipher_decrypt(subreq);
-
-   skcipher_request_zero(subreq);
+   ret = crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
dd = omap_aes_find_dev(rctx);
@@ -590,11 +588,11 @@ static int omap_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
memcpy(ctx->key, key, keylen);
ctx->keylen = keylen;
 
-   crypto_sync_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
+   crypto_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
 CRYPTO_TFM_REQ_MASK);
 
-   ret = crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   ret = crypto_skcipher_setkey(ctx->fallback, key, keylen);
if (!ret)
return 0;
 
@@ -640,15 +638,16 @@ static int omap_aes_init_tfm(struct crypto_skcipher *tfm)
 {
const char *name = crypto_tfm_alg_name(&tfm->base);
struct omap_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *blk;
+   struct crypto_skcipher *blk;
 
-   blk = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   blk = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(blk))
return PTR_ERR(blk);
 
ctx->fallback = blk;
 
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct omap_aes_reqctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct omap_aes_reqctx) +
+crypto_skcipher_reqsize(blk));
 
ctx->enginectx.op.prepare_request = omap_aes_prepare_req;
ctx->enginectx.op.unprepare_request = NULL;
@@ -662,7 +661,7 @@ static void omap_aes_exit_tfm(struct crypto_skcipher *tfm)
struct omap_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 
if (ctx->fallback)
-   crypto_free_sync_skcipher(ctx->fallback);
+   crypto_free_skcipher(ctx->fallback);
 
ctx->fallback = NULL;
 }
diff --git a/drivers/crypto/omap-aes.h b/drivers/crypto/omap-aes.h
index 2d111bf906e1..23d073e87bb8 10064

[PATCH v2 04/13] crypto: sun4i - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the sun4i driver implements asynchronous versions of ecb(aes)
and cbc(aes), the fallbacks it allocates are required to be synchronous.
Given that SIMD based software implementations are usually asynchronous
as well, even though they rarely complete asynchronously (this typically
only happens in cases where the request was made from softirq context,
while SIMD was already in use in the task context that it interrupted),
these implementations are disregarded, and either the generic C version
or another table based version implemented in assembler is selected
instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback, and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c | 46 ++--
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss.h|  3 +-
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c 
b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
index 7f22d305178e..b72de8939497 100644
--- a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
@@ -122,19 +122,17 @@ static int noinline_for_stack 
sun4i_ss_cipher_poll_fallback(struct skcipher_requ
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
struct sun4i_tfm_ctx *op = crypto_skcipher_ctx(tfm);
struct sun4i_cipher_req_ctx *ctx = skcipher_request_ctx(areq);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
int err;
 
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL,
- NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&ctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&ctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&ctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (ctx->mode & SS_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&ctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&ctx->fallback_req);
 
return err;
 }
@@ -494,23 +492,25 @@ int sun4i_ss_cipher_init(struct crypto_tfm *tfm)
alg.crypto.base);
op->ss = algt->ss;
 
-   crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
-   sizeof(struct sun4i_cipher_req_ctx));
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ss->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+   sizeof(struct sun4i_cipher_req_ctx) +
+   crypto_skcipher_reqsize(op->fallback_tfm));
+
+
err = pm_runtime_get_sync(op->ss->dev);
if (err < 0)
goto error_pm;
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -518,7 +518,7 @@ void sun4i_ss_cipher_exit(struct crypto_tfm *tfm)
 {
struct sun4i_tfm_ctx *op = crypto_tfm_ctx(tfm);
 
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put(op->ss->dev);
 }
 
@@ -546,10 +546,10 @@ int sun4i_ss_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
op->keylen = keylen;
memcpy(op->key, key, keylen);
 
-   crypto_sync_skcipher_clear_flags(op->fallback_tfm, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(op->fallback_tfm, tfm->base.crt_flags & 
CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_clear_flags(op->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(op->fallback_tfm, tfm->base.crt_flags & 
CRYPTO_TFM_REQ_MASK);
 
-   return crypto_sync_skcipher_setkey(op->fa

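The conversion described above has the same shape in every driver touched by
this series. The following is a minimal, hedged sketch of that shape; all
identifiers (drv_ctx, drv_reqctx, drv_init_tfm, drv_exit_tfm,
drv_fallback_crypt) are placeholders and not taken from any of the drivers
being patched:

#include <crypto/internal/skcipher.h>
#include <linux/err.h>

struct drv_ctx {
	struct crypto_skcipher *fallback;	/* ordinary (possibly async) skcipher */
};

struct drv_reqctx {
	unsigned long mode;
	/* must stay last: the space reserved for the fallback's own request
	 * context starts directly behind this struct */
	struct skcipher_request fallback_req;
};

static int drv_init_tfm(struct crypto_skcipher *tfm)
{
	struct drv_ctx *ctx = crypto_skcipher_ctx(tfm);

	ctx->fallback = crypto_alloc_skcipher(crypto_tfm_alg_name(crypto_skcipher_tfm(tfm)),
					      0, CRYPTO_ALG_NEED_FALLBACK);
	if (IS_ERR(ctx->fallback))
		return PTR_ERR(ctx->fallback);

	/* room for our own reqctx plus whatever the fallback needs */
	crypto_skcipher_set_reqsize(tfm, sizeof(struct drv_reqctx) +
					 crypto_skcipher_reqsize(ctx->fallback));
	return 0;
}

static void drv_exit_tfm(struct crypto_skcipher *tfm)
{
	struct drv_ctx *ctx = crypto_skcipher_ctx(tfm);

	crypto_free_skcipher(ctx->fallback);
}

static int drv_fallback_crypt(struct skcipher_request *req, bool enc)
{
	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
	struct drv_ctx *ctx = crypto_skcipher_ctx(tfm);
	struct drv_reqctx *rctx = skcipher_request_ctx(req);

	skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
	/* forward the caller's completion routine so the fallback is free to
	 * complete asynchronously */
	skcipher_request_set_callback(&rctx->fallback_req, req->base.flags,
				      req->base.complete, req->base.data);
	skcipher_request_set_crypt(&rctx->fallback_req, req->src, req->dst,
				   req->cryptlen, req->iv);

	return enc ? crypto_skcipher_encrypt(&rctx->fallback_req) :
		     crypto_skcipher_decrypt(&rctx->fallback_req);
}

Because nothing needs to be copied back or zeroed once the fallback finishes,
no intermediate callback is needed and its completion goes straight to the
original caller.
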
[PATCH v2 05/13] crypto: sun8i-ce - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the sun8i-ce driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 41 ++--
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h|  3 +-
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c 
b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index a6abb701bfc6..82c99da24dfd 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -58,23 +58,20 @@ static int sun8i_ce_cipher_fallback(struct skcipher_request 
*areq)
 #ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct sun8i_ce_alg_template *algt;
-#endif
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
 
-#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
 
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (rctx->op_dir & CE_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -335,18 +332,20 @@ int sun8i_ce_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);
op->ce = algt->ce;
 
-   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ce->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
+
dev_info(op->ce->dev, "Fallback for %s is %s\n",
 crypto_tfm_alg_driver_name(&sktfm->base),
-
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(&op->fallback_tfm->base)));
+
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(op->fallback_tfm)));
 
op->enginectx.op.do_one_request = sun8i_ce_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
@@ -358,7 +357,7 @@ int sun8i_ce_cipher_init(struct crypto_tfm *tfm)
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -370,7 +369,7 @@ void sun8i_ce_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put_sync_suspend(op->ce->dev);
 }
 
@@ -400,10 +399,10 @@ int sun8i_ce_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
if (!op->key)
return -ENOMEM;
 
-   crypto_sync_skcipher_clear_flags(op->f

[PATCH v2 11/13] crypto: qce - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the qce driver implements asynchronous versions of ecb(aes),
cbc(aes) and xts(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/qce/cipher.h   |  3 ++-
 drivers/crypto/qce/skcipher.c | 27 ++--
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/qce/cipher.h b/drivers/crypto/qce/cipher.h
index 7770660bc853..cffa9fc628ff 100644
--- a/drivers/crypto/qce/cipher.h
+++ b/drivers/crypto/qce/cipher.h
@@ -14,7 +14,7 @@
 struct qce_cipher_ctx {
u8 enc_key[QCE_MAX_KEY_SIZE];
unsigned int enc_keylen;
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher *fallback;
 };
 
 /**
@@ -43,6 +43,7 @@ struct qce_cipher_reqctx {
struct sg_table src_tbl;
struct scatterlist *src_sg;
unsigned int cryptlen;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 static inline struct qce_alg_template *to_cipher_tmpl(struct crypto_skcipher 
*tfm)
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 9412433f3b21..265afae29901 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -178,7 +178,7 @@ static int qce_skcipher_setkey(struct crypto_skcipher 
*ablk, const u8 *key,
break;
}
 
-   ret = crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   ret = crypto_skcipher_setkey(ctx->fallback, key, keylen);
if (!ret)
ctx->enc_keylen = keylen;
return ret;
@@ -235,16 +235,15 @@ static int qce_skcipher_crypt(struct skcipher_request 
*req, int encrypt)
  req->cryptlen <= aes_sw_max_len) ||
 (IS_XTS(rctx->flags) && req->cryptlen > QCE_SECTOR_SIZE &&
  req->cryptlen % QCE_SECTOR_SIZE))) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   ret = encrypt ? crypto_skcipher_encrypt(subreq) :
-   crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   ret = encrypt ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+   crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
 
@@ -275,8 +274,10 @@ static int qce_skcipher_init_fallback(struct 
crypto_skcipher *tfm)
struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
 
qce_skcipher_init(tfm);
-   ctx->fallback = 
crypto_alloc_sync_skcipher(crypto_tfm_alg_name(&tfm->base),
+   ctx->fallback = crypto_alloc_skcipher(crypto_tfm_alg_name(&tfm->base),
   0, CRYPTO_ALG_NEED_FALLBACK);
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct qce_cipher_reqctx) +
+
crypto_skcipher_reqsize(ctx->fallback));
return PTR_ERR_OR_ZERO(ctx->fallback);
 }
 
@@ -284,7 +285,7 @@ static void qce_skcipher_exit(struct crypto_skcipher *tfm)
 {
struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
 
-   crypto_free_sync_skcipher(ctx->fallback);
+   crypto_free_skcipher(ctx->fallback);
 }
 
 struct qce_skcipher_def {
-- 
2.27.0



[PATCH v2 07/13] crypto: ccp - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the ccp driver implements an asynchronous version of xts(aes),
the fallback it allocates is required to be synchronous. Given that SIMD
based software implementations are usually asynchronous as well, even
though they rarely complete asynchronously (this typically only happens
in cases where the request was made from softirq context, while SIMD was
already in use in the task context that it interrupted), these
implementations are disregarded, and either the generic C version or
another table based version implemented in assembler is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c | 33 ++--
 drivers/crypto/ccp/ccp-crypto.h |  4 ++-
 2 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c 
b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
index 04b2517df955..959168a7ac59 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
@@ -98,7 +98,7 @@ static int ccp_aes_xts_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
ctx->u.aes.key_len = key_len / 2;
sg_init_one(&ctx->u.aes.key_sg, ctx->u.aes.key, key_len);
 
-   return crypto_sync_skcipher_setkey(ctx->u.aes.tfm_skcipher, key, 
key_len);
+   return crypto_skcipher_setkey(ctx->u.aes.tfm_skcipher, key, key_len);
 }
 
 static int ccp_aes_xts_crypt(struct skcipher_request *req,
@@ -145,20 +145,19 @@ static int ccp_aes_xts_crypt(struct skcipher_request *req,
(ctx->u.aes.key_len != AES_KEYSIZE_256))
fallback = 1;
if (fallback) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq,
-  ctx->u.aes.tfm_skcipher);
-
/* Use the fallback to process the request for any
 * unsupported unit sizes or key sizes
 */
-   skcipher_request_set_sync_tfm(subreq, ctx->u.aes.tfm_skcipher);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   ret = encrypt ? crypto_skcipher_encrypt(subreq) :
-   crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   skcipher_request_set_tfm(&rctx->fallback_req,
+ctx->u.aes.tfm_skcipher);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   ret = encrypt ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+   crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
 
@@ -198,13 +197,12 @@ static int ccp_aes_xts_decrypt(struct skcipher_request 
*req)
 static int ccp_aes_xts_init_tfm(struct crypto_skcipher *tfm)
 {
struct ccp_ctx *ctx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *fallback_tfm;
+   struct crypto_skcipher *fallback_tfm;
 
ctx->complete = ccp_aes_xts_complete;
ctx->u.aes.key_len = 0;
 
-   fallback_tfm = crypto_alloc_sync_skcipher("xts(aes)", 0,
-CRYPTO_ALG_ASYNC |
+   fallback_tfm = crypto_alloc_skcipher("xts(aes)", 0,
 CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(fallback_tfm)) {
pr_warn("could not load fallback driver xts(aes)\n");
@@ -212,7 +210,8 @@ static int ccp_aes_xts_init_tfm(struct crypto_skcipher *tfm)
}
ctx->u.aes.tfm_skcipher = fallback_tfm;
 
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct ccp_aes_req_ctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct ccp_aes_req_ctx) +
+crypto_skcipher_reqsize(fallback_tfm));
 
return 0;
 }
@@ -221,7 +220,7 @@ static void ccp_aes_xts_exit_tfm(struct crypto_skcipher 
*tfm)
 {
struct ccp_ctx *ctx = crypto_skcipher_ctx(tfm);
 
-   crypto_free_sync_skcipher(ctx->u.aes.tfm_skcipher);
+   crypto_fre

[PATCH v2 06/13] crypto: sun8i-ss - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the sun8i-ss driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c | 39 ++--
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h|  3 +-
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c 
b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
index c89cb2ee2496..7a131675a41c 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
@@ -73,7 +73,6 @@ static int sun8i_ss_cipher_fallback(struct skcipher_request 
*areq)
struct sun8i_cipher_req_ctx *rctx = skcipher_request_ctx(areq);
int err;
 
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
 #ifdef CONFIG_CRYPTO_DEV_SUN8I_SS_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct sun8i_ss_alg_template *algt;
@@ -81,15 +80,15 @@ static int sun8i_ss_cipher_fallback(struct skcipher_request 
*areq)
algt = container_of(alg, struct sun8i_ss_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (rctx->op_dir & SS_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -334,18 +333,20 @@ int sun8i_ss_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct sun8i_ss_alg_template, alg.skcipher);
op->ss = algt->ss;
 
-   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ss->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
+
dev_info(op->ss->dev, "Fallback for %s is %s\n",
 crypto_tfm_alg_driver_name(&sktfm->base),
-
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(&op->fallback_tfm->base)));
+
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(op->fallback_tfm)));
 
op->enginectx.op.do_one_request = sun8i_ss_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
@@ -359,7 +360,7 @@ int sun8i_ss_cipher_init(struct crypto_tfm *tfm)
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -371,7 +372,7 @@ void sun8i_ss_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put_sync(op->ss->dev);
 }
 
@@ -401,10 +402,10 @@ int sun8i_ss_aes_setkey(struct crypto_skcipher *tfm, 
const u8

[PATCH v2 08/13] crypto: chelsio - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the chelsio driver implements asynchronous versions of
cbc(aes) and xts(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/chelsio/chcr_algo.c   | 57 
 drivers/crypto/chelsio/chcr_crypto.h |  3 +-
 2 files changed, 25 insertions(+), 35 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index 4c2553672b6f..a6625b90fb1a 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -690,26 +690,22 @@ static int chcr_sg_ent_in_wr(struct scatterlist *src,
return min(srclen, dstlen);
 }
 
-static int chcr_cipher_fallback(struct crypto_sync_skcipher *cipher,
-   u32 flags,
-   struct scatterlist *src,
-   struct scatterlist *dst,
-   unsigned int nbytes,
+static int chcr_cipher_fallback(struct crypto_skcipher *cipher,
+   struct skcipher_request *req,
u8 *iv,
unsigned short op_type)
 {
+   struct chcr_skcipher_req_ctx *reqctx = skcipher_request_ctx(req);
int err;
 
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, cipher);
-
-   skcipher_request_set_sync_tfm(subreq, cipher);
-   skcipher_request_set_callback(subreq, flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, src, dst,
-  nbytes, iv);
+   skcipher_request_set_tfm(&reqctx->fallback_req, cipher);
+   skcipher_request_set_callback(&reqctx->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&reqctx->fallback_req, req->src, req->dst,
+  req->cryptlen, iv);
 
-   err = op_type ? crypto_skcipher_decrypt(subreq) :
-   crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = op_type ? crypto_skcipher_decrypt(&reqctx->fallback_req) :
+   crypto_skcipher_encrypt(&reqctx->fallback_req);
 
return err;
 
@@ -924,11 +920,11 @@ static int chcr_cipher_fallback_setkey(struct 
crypto_skcipher *cipher,
 {
struct ablk_ctx *ablkctx = ABLK_CTX(c_ctx(cipher));
 
-   crypto_sync_skcipher_clear_flags(ablkctx->sw_cipher,
+   crypto_skcipher_clear_flags(ablkctx->sw_cipher,
CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ablkctx->sw_cipher,
+   crypto_skcipher_set_flags(ablkctx->sw_cipher,
cipher->base.crt_flags & CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(ablkctx->sw_cipher, key, keylen);
+   return crypto_skcipher_setkey(ablkctx->sw_cipher, key, keylen);
 }
 
 static int chcr_aes_cbc_setkey(struct crypto_skcipher *cipher,
@@ -1206,13 +1202,8 @@ static int chcr_handle_cipher_resp(struct 
skcipher_request *req,
  req);
memcpy(req->iv, reqctx->init_iv, IV);
atomic_inc(&adap->chcr_stats.fallback);
-   err = chcr_cipher_fallback(ablkctx->sw_cipher,
-req->base.flags,
-req->src,
-req->dst,
-req->cryptlen,
-req->iv,
-reqctx->op);
+   err = chcr_cipher_fallback(ablkctx->sw_cipher, req, req->iv,
+  reqctx->op);
goto complete;
}
 
@@ -1341,11 +1332,7 @@ static int process_cipher(struct skcipher_request *req,
chcr_cipher_dma_unmap(&ULD_CTX(c_ctx(tfm))->lldi.pdev->dev,
  req);
 fallback:   atomic_inc(&adap->chcr_stats.fallback);
-   err = chcr_cipher_fallback(ablkctx->sw_cipher,
- 

[PATCH v2 12/13] crypto: sahara - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the sahara driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/sahara.c | 96 +---
 1 file changed, 45 insertions(+), 51 deletions(-)

diff --git a/drivers/crypto/sahara.c b/drivers/crypto/sahara.c
index 466e30bd529c..0c8cb23ae708 100644
--- a/drivers/crypto/sahara.c
+++ b/drivers/crypto/sahara.c
@@ -146,11 +146,12 @@ struct sahara_ctx {
/* AES-specific context */
int keylen;
u8 key[AES_KEYSIZE_128];
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher *fallback;
 };
 
 struct sahara_aes_reqctx {
unsigned long mode;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 /*
@@ -617,10 +618,10 @@ static int sahara_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
/*
 * The requested key size is not supported by HW, do a fallback.
 */
-   crypto_sync_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
+   crypto_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
 CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   return crypto_skcipher_setkey(ctx->fallback, key, keylen);
 }
 
 static int sahara_aes_crypt(struct skcipher_request *req, unsigned long mode)
@@ -651,21 +652,19 @@ static int sahara_aes_crypt(struct skcipher_request *req, 
unsigned long mode)
 
 static int sahara_aes_ecb_encrypt(struct skcipher_request *req)
 {
+   struct sahara_aes_reqctx *rctx = skcipher_request_ctx(req);
struct sahara_ctx *ctx = crypto_skcipher_ctx(
crypto_skcipher_reqtfm(req));
-   int err;
 
if (unlikely(ctx->keylen != AES_KEYSIZE_128)) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
-   return err;
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   return crypto_skcipher_encrypt(&rctx->fallback_req);
}
 
return sahara_aes_crypt(req, FLAGS_ENCRYPT);
@@ -673,21 +672,19 @@ static int sahara_aes_ecb_encrypt(struct skcipher_request 
*req)
 
 static int sahara_aes_ecb_decrypt(struct skcipher_request *req)
 {
+   struct sahara_aes_reqctx *rctx = skcipher_request_ctx(req);
struct sahara_ctx *ctx = crypto_skcipher_ctx(
crypto_skcipher_reqtfm(req));
-   int err;
 
if (unlikely(ctx->keylen != AES_KEYSIZE_128)) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   err = crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
- 

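The setkey paths in these conversions follow the same idea: the outer
transform's request flags are mirrored onto the fallback before it is given
the key. A short sketch, reusing the same made-up placeholder names as the
earlier sketch (drv_ctx, drv_setkey are not from any driver in this series):

#include <crypto/internal/skcipher.h>

struct drv_ctx {
	struct crypto_skcipher *fallback;
};

static int drv_setkey(struct crypto_skcipher *tfm, const u8 *key,
		      unsigned int keylen)
{
	struct drv_ctx *ctx = crypto_skcipher_ctx(tfm);

	/* propagate the outer tfm's request flags to the fallback */
	crypto_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
	crypto_skcipher_set_flags(ctx->fallback,
				  crypto_skcipher_get_flags(tfm) &
				  CRYPTO_TFM_REQ_MASK);

	return crypto_skcipher_setkey(ctx->fallback, key, keylen);
}
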
[PATCH v2 10/13] crypto: picoxcell - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the picoxcell driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/picoxcell_crypto.c | 34 +++-
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/picoxcell_crypto.c 
b/drivers/crypto/picoxcell_crypto.c
index 7384e91c8b32..eea75c7cbdf2 100644
--- a/drivers/crypto/picoxcell_crypto.c
+++ b/drivers/crypto/picoxcell_crypto.c
@@ -86,6 +86,7 @@ struct spacc_req {
dma_addr_t  src_addr, dst_addr;
struct spacc_ddt*src_ddt, *dst_ddt;
void(*complete)(struct spacc_req *req);
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 struct spacc_aead {
@@ -158,7 +159,7 @@ struct spacc_ablk_ctx {
 * The fallback cipher. If the operation can't be done in hardware,
 * fallback to a software version.
 */
-   struct crypto_sync_skcipher *sw_cipher;
+   struct crypto_skcipher  *sw_cipher;
 };
 
 /* AEAD cipher context. */
@@ -792,13 +793,13 @@ static int spacc_aes_setkey(struct crypto_skcipher 
*cipher, const u8 *key,
 * Set the fallback transform to use the same request flags as
 * the hardware transform.
 */
-   crypto_sync_skcipher_clear_flags(ctx->sw_cipher,
+   crypto_skcipher_clear_flags(ctx->sw_cipher,
CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->sw_cipher,
+   crypto_skcipher_set_flags(ctx->sw_cipher,
  cipher->base.crt_flags &
  CRYPTO_TFM_REQ_MASK);
 
-   err = crypto_sync_skcipher_setkey(ctx->sw_cipher, key, len);
+   err = crypto_skcipher_setkey(ctx->sw_cipher, key, len);
if (err)
goto sw_setkey_failed;
}
@@ -900,7 +901,7 @@ static int spacc_ablk_do_fallback(struct skcipher_request 
*req,
struct crypto_tfm *old_tfm =
crypto_skcipher_tfm(crypto_skcipher_reqtfm(req));
struct spacc_ablk_ctx *ctx = crypto_tfm_ctx(old_tfm);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->sw_cipher);
+   struct spacc_req *dev_req = skcipher_request_ctx(req);
int err;
 
/*
@@ -908,13 +909,13 @@ static int spacc_ablk_do_fallback(struct skcipher_request 
*req,
 * the ciphering has completed, put the old transform back into the
 * request.
 */
-   skcipher_request_set_sync_tfm(subreq, ctx->sw_cipher);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
+   skcipher_request_set_tfm(&dev_req->fallback_req, ctx->sw_cipher);
+   skcipher_request_set_callback(&dev_req->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&dev_req->fallback_req, req->src, req->dst,
   req->cryptlen, req->iv);
-   err = is_encrypt ? crypto_skcipher_encrypt(subreq) :
-  crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = is_encrypt ? crypto_skcipher_encrypt(&dev_req->fallback_req) :
+  crypto_skcipher_decrypt(&dev_req->fallback_req);
 
return err;
 }
@@ -1007,19 +1008,22 @@ static int spacc_ablk_init_tfm(struct crypto_skcipher 
*tfm)
ctx->generic.flags = spacc_alg->type;
ctx->generic.engine = engine;
if (alg->base.cra_flags & CRYPTO_ALG_NEED_FALLBACK) {
-   ctx->sw_cipher = crypto_alloc_sync_skcipher(
+   ctx->sw_cipher = crypto_alloc_skcipher(
alg->base.cra_name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(ctx->sw_cipher)) {
dev_warn(engine->dev, "fa

[PATCH v2 09/13] crypto: mxs-dcp - permit asynchronous skcipher as fallback

2020-06-27 Thread Ard Biesheuvel
Even though the mxs-dcp driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/mxs-dcp.c | 33 ++--
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
index d84530293036..909a7eb748e3 100644
--- a/drivers/crypto/mxs-dcp.c
+++ b/drivers/crypto/mxs-dcp.c
@@ -97,7 +97,7 @@ struct dcp_async_ctx {
unsigned inthot:1;
 
/* Crypto-specific context */
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher  *fallback;
unsigned intkey_len;
uint8_t key[AES_KEYSIZE_128];
 };
@@ -105,6 +105,7 @@ struct dcp_async_ctx {
 struct dcp_aes_req_ctx {
unsigned intenc:1;
unsigned intecb:1;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 struct dcp_sha_req_ctx {
@@ -426,21 +427,20 @@ static int dcp_chan_thread_aes(void *data)
 static int mxs_dcp_block_fallback(struct skcipher_request *req, int enc)
 {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+   struct dcp_aes_req_ctx *rctx = skcipher_request_ctx(req);
struct dcp_async_ctx *ctx = crypto_skcipher_ctx(tfm);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
int ret;
 
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src, req->dst,
   req->cryptlen, req->iv);
 
if (enc)
-   ret = crypto_skcipher_encrypt(subreq);
+   ret = crypto_skcipher_encrypt(&rctx->fallback_req);
else
-   ret = crypto_skcipher_decrypt(subreq);
-
-   skcipher_request_zero(subreq);
+   ret = crypto_skcipher_decrypt(&rctx->fallback_req);
 
return ret;
 }
@@ -510,24 +510,25 @@ static int mxs_dcp_aes_setkey(struct crypto_skcipher 
*tfm, const u8 *key,
 * but is supported by in-kernel software implementation, we use
 * software fallback.
 */
-   crypto_sync_skcipher_clear_flags(actx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(actx->fallback,
+   crypto_skcipher_clear_flags(actx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(actx->fallback,
  tfm->base.crt_flags & CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(actx->fallback, key, len);
+   return crypto_skcipher_setkey(actx->fallback, key, len);
 }
 
 static int mxs_dcp_aes_fallback_init_tfm(struct crypto_skcipher *tfm)
 {
const char *name = crypto_tfm_alg_name(crypto_skcipher_tfm(tfm));
struct dcp_async_ctx *actx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *blk;
+   struct crypto_skcipher *blk;
 
-   blk = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   blk = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(blk))
return PTR_ERR(blk);
 
actx->fallback = blk;
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct dcp_aes_req_ctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct dcp_aes_req_ctx) +
+crypto_skcipher_reqsize(blk));
return 0;
 }
 
@@ -535,7 +536,7 @@ static void mxs_dcp_aes_fallback_exit_tfm(struct 
crypto_skcipher *tfm)
 {
struct dcp_async_ctx *actx = crypto_skcipher_ctx(tfm);
 
-   crypto_free_sync_skcipher(actx->fallback);
+   crypto_free_skcipher(actx->fallback);
 }
 
 /*
-- 
2.27.0



Re: [PATCH v2 4/4] crypto: qat - fallback for xts with 192 bit keys

2020-06-26 Thread Ard Biesheuvel
On Fri, 26 Jun 2020 at 10:04, Giovanni Cabiddu
 wrote:
>
> Forward requests to another provider if the key length is 192 bits as
> this is not supported by the QAT accelerators.
>
> This fixes the following issue reported by the extra self test:
> alg: skcipher: qat_aes_xts setkey failed on test vector "random: len=3204
> klen=48"; expected_error=0, actual_error=-22, flags=0x1
>
> Signed-off-by: Giovanni Cabiddu 
> ---
>  drivers/crypto/qat/qat_common/qat_algs.c | 67 ++--
>  1 file changed, 64 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/crypto/qat/qat_common/qat_algs.c 
> b/drivers/crypto/qat/qat_common/qat_algs.c
> index 77bdff0118f7..5e8c0b6f2834 100644
> --- a/drivers/crypto/qat/qat_common/qat_algs.c
> +++ b/drivers/crypto/qat/qat_common/qat_algs.c
> @@ -88,6 +88,8 @@ struct qat_alg_skcipher_ctx {
> struct icp_qat_fw_la_bulk_req enc_fw_req;
> struct icp_qat_fw_la_bulk_req dec_fw_req;
> struct qat_crypto_instance *inst;
> +   struct crypto_skcipher *ftfm;
> +   bool fallback;
>  };
>
>  static int qat_get_inter_state_size(enum icp_qat_hw_auth_algo qat_hash_alg)
> @@ -994,12 +996,25 @@ static int qat_alg_skcipher_ctr_setkey(struct 
> crypto_skcipher *tfm,
>  static int qat_alg_skcipher_xts_setkey(struct crypto_skcipher *tfm,
>const u8 *key, unsigned int keylen)
>  {
> +   struct qat_alg_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm);
> int ret;
>
> ret = xts_verify_key(tfm, key, keylen);
> if (ret)
> return ret;
>
> +   if (keylen >> 1 == AES_KEYSIZE_192) {
> +   ret = crypto_skcipher_setkey(ctx->ftfm, key, keylen);
> +   if (ret)
> +   return ret;
> +
> +   ctx->fallback = true;
> +
> +   return 0;
> +   }
> +
> +   ctx->fallback = false;
> +
> return qat_alg_skcipher_setkey(tfm, key, keylen,
>ICP_QAT_HW_CIPHER_XTS_MODE);
>  }
> @@ -1066,9 +1081,19 @@ static int qat_alg_skcipher_blk_encrypt(struct 
> skcipher_request *req)
>
>  static int qat_alg_skcipher_xts_encrypt(struct skcipher_request *req)
>  {
> +   struct crypto_skcipher *stfm = crypto_skcipher_reqtfm(req);
> +   struct qat_alg_skcipher_ctx *ctx = crypto_skcipher_ctx(stfm);
> +   struct skcipher_request *nreq = skcipher_request_ctx(req);
> +
> if (req->cryptlen < XTS_BLOCK_SIZE)
> return -EINVAL;
>
> +   if (ctx->fallback) {
> +   memcpy(nreq, req, sizeof(*req));
> +   skcipher_request_set_tfm(nreq, ctx->ftfm);
> +   return crypto_skcipher_encrypt(nreq);
> +   }
> +
> return qat_alg_skcipher_encrypt(req);
>  }
>
> @@ -1134,9 +1159,19 @@ static int qat_alg_skcipher_blk_decrypt(struct 
> skcipher_request *req)
>
>  static int qat_alg_skcipher_xts_decrypt(struct skcipher_request *req)
>  {
> +   struct crypto_skcipher *stfm = crypto_skcipher_reqtfm(req);
> +   struct qat_alg_skcipher_ctx *ctx = crypto_skcipher_ctx(stfm);
> +   struct skcipher_request *nreq = skcipher_request_ctx(req);
> +
> if (req->cryptlen < XTS_BLOCK_SIZE)
> return -EINVAL;
>
> +   if (ctx->fallback) {
> +   memcpy(nreq, req, sizeof(*req));
> +   skcipher_request_set_tfm(nreq, ctx->ftfm);
> +   return crypto_skcipher_decrypt(nreq);
> +   }
> +
> return qat_alg_skcipher_decrypt(req);
>  }
>
> @@ -1200,6 +1235,23 @@ static int qat_alg_skcipher_init_tfm(struct 
> crypto_skcipher *tfm)
> return 0;
>  }
>
> +static int qat_alg_skcipher_init_xts_tfm(struct crypto_skcipher *tfm)
> +{
> +   struct qat_alg_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm);
> +   int reqsize;
> +
> +   ctx->ftfm = crypto_alloc_skcipher("xts(aes)", 0, CRYPTO_ALG_ASYNC);

Why are you only permitting synchronous fallbacks? If the logic above
is sound, and the request copy carries over the base.complete and
base.data fields as well, the fallback can complete asynchronously
without problems.

Note that SIMD s/w implementations of XTS(AES) are asynchronous as
well, as they use the crypto_simd helper, which queues requests for
asynchronous completion if the context from which the request was
issued does not permit access to the SIMD register file (e.g., softirq
context on some architectures, if the interrupted context is also
using SIMD).


> +   if (IS_ERR(ctx->ftfm))
> +   return PTR_ERR(ctx->ftfm);
> +
> +   reqsize = max(sizeof(struct qat_crypto_request),
> + sizeof(struct skcipher_request) +
> + crypto_skcipher_reqsize(ctx->ftfm));
> +   crypto_skcipher_set_reqsize(tfm, reqsize);
> +
> +   return 0;
> +}
> +
>  static void qat_alg_skcipher_exit_tfm(struct crypto_skcipher *tfm)
>  {
> struct qat_alg_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm);
> @@ -1227,6 +1279,15 @@ static void qat_alg_skcipher_exit_

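To make the review point above concrete, here is a hedged illustration of why
copying the outer request before retargeting it makes an asynchronous
fallback safe; the identifiers are invented for the sketch and are not part
of the quoted patch:

#include <crypto/internal/skcipher.h>

static int xts_fwd_to_fallback(struct skcipher_request *req,
			       struct crypto_skcipher *ftfm, bool enc)
{
	/* the outer tfm's reqsize must cover sizeof(*subreq) plus the
	 * fallback's own reqsize, e.g.
	 *   max(sizeof(driver reqctx),
	 *       sizeof(struct skcipher_request) +
	 *       crypto_skcipher_reqsize(ftfm))
	 */
	struct skcipher_request *subreq = skcipher_request_ctx(req);

	*subreq = *req;		/* carries over base.complete and base.data */
	skcipher_request_set_tfm(subreq, ftfm);

	return enc ? crypto_skcipher_encrypt(subreq) :
		     crypto_skcipher_decrypt(subreq);
}

Since the caller's completion routine and data survive the copy, the fallback
may complete asynchronously, so restricting the allocation to synchronous
implementations (by passing CRYPTO_ALG_ASYNC as the mask) is unnecessary.
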
Re: [PATCH v2] net: phy: mscc: avoid skcipher API for single block AES encryption

2020-06-25 Thread Ard Biesheuvel
On Thu, 25 Jun 2020 at 21:16, David Miller  wrote:
>
> From: Ard Biesheuvel 
> Date: Thu, 25 Jun 2020 09:18:16 +0200
>
> > The skcipher API dynamically instantiates the transformation object
> > on request that implements the requested algorithm optimally on the
> > given platform. This notion of optimality only matters for cases like
> > bulk network or disk encryption, where performance can be a bottleneck,
> > or in cases where the algorithm itself is not known at compile time.
> >
> > In the mscc case, we are dealing with AES encryption of a single
> > block, and so neither concern applies, and we are better off using
> > the AES library interface, which is lightweight and safe for this
> > kind of use.
> >
> > Note that the scatterlist API does not permit references to buffers
> > that are located on the stack, so the existing code is incorrect in
> > any case, but avoiding the skcipher and scatterlist APIs entirely is
> > the most straight-forward approach to fixing this.
> >
> > Fixes: 28c5107aa904e ("net: phy: mscc: macsec support")
> > Reviewed-by: Eric Biggers 
> > Signed-off-by: Ard Biesheuvel 
>
> Applied and queued up for -stable, thanks.
>
> Please never CC: stable for networking changes, I handle the submissions
> by hand.
>

Noted, thanks.

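For illustration, single-block encryption via the AES library interface
referred to above amounts to roughly the following; encrypt_one_block is a
made-up name, while aes_expandkey()/aes_encrypt() are the actual library
calls, and the buffers may live on the stack since no scatterlists are
involved:

#include <crypto/aes.h>
#include <linux/string.h>

static int encrypt_one_block(const u8 *key, unsigned int key_len,
			     const u8 src[AES_BLOCK_SIZE],
			     u8 dst[AES_BLOCK_SIZE])
{
	struct crypto_aes_ctx ctx;
	int err;

	err = aes_expandkey(&ctx, key, key_len);
	if (err)
		return err;

	aes_encrypt(&ctx, dst, src);
	memzero_explicit(&ctx, sizeof(ctx));	/* wipe the expanded key */
	return 0;
}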

[PATCH 04/12] crypto: sun4i - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the sun4i driver implements asynchronous versions of ecb(aes)
and cbc(aes), the fallbacks it allocates are required to be synchronous.
Given that SIMD based software implementations are usually asynchronous
as well, even though they rarely complete asynchronously (this typically
only happens in cases where the request was made from softirq context,
while SIMD was already in use in the task context that it interrupted),
these implementations are disregarded, and either the generic C version
or another table based version implemented in assembler is selected
instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c | 46 ++--
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss.h|  3 +-
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c 
b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
index 7f22d305178e..b72de8939497 100644
--- a/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c
@@ -122,19 +122,17 @@ static int noinline_for_stack 
sun4i_ss_cipher_poll_fallback(struct skcipher_requ
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(areq);
struct sun4i_tfm_ctx *op = crypto_skcipher_ctx(tfm);
struct sun4i_cipher_req_ctx *ctx = skcipher_request_ctx(areq);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
int err;
 
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL,
- NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&ctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&ctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&ctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (ctx->mode & SS_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&ctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&ctx->fallback_req);
 
return err;
 }
@@ -494,23 +492,25 @@ int sun4i_ss_cipher_init(struct crypto_tfm *tfm)
alg.crypto.base);
op->ss = algt->ss;
 
-   crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
-   sizeof(struct sun4i_cipher_req_ctx));
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ss->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+   sizeof(struct sun4i_cipher_req_ctx) +
+   crypto_skcipher_reqsize(op->fallback_tfm));
+
+
err = pm_runtime_get_sync(op->ss->dev);
if (err < 0)
goto error_pm;
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -518,7 +518,7 @@ void sun4i_ss_cipher_exit(struct crypto_tfm *tfm)
 {
struct sun4i_tfm_ctx *op = crypto_tfm_ctx(tfm);
 
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put(op->ss->dev);
 }
 
@@ -546,10 +546,10 @@ int sun4i_ss_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
op->keylen = keylen;
memcpy(op->key, key, keylen);
 
-   crypto_sync_skcipher_clear_flags(op->fallback_tfm, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(op->fallback_tfm, tfm->base.crt_flags & 
CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_clear_flags(op->fallback_tfm, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(op->fallback_tfm, tfm->base.crt_flags & 
CRYPTO_TFM_REQ_MASK);
 
-   return crypto_sync_skcipher_setkey(op->fa

[PATCH 12/12] crypto: sahara - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the sahara driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/sahara.c | 96 +---
 1 file changed, 45 insertions(+), 51 deletions(-)

diff --git a/drivers/crypto/sahara.c b/drivers/crypto/sahara.c
index 466e30bd529c..0c8cb23ae708 100644
--- a/drivers/crypto/sahara.c
+++ b/drivers/crypto/sahara.c
@@ -146,11 +146,12 @@ struct sahara_ctx {
/* AES-specific context */
int keylen;
u8 key[AES_KEYSIZE_128];
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher *fallback;
 };
 
 struct sahara_aes_reqctx {
unsigned long mode;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 /*
@@ -617,10 +618,10 @@ static int sahara_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
/*
 * The requested key size is not supported by HW, do a fallback.
 */
-   crypto_sync_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
+   crypto_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
 CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   return crypto_skcipher_setkey(ctx->fallback, key, keylen);
 }
 
 static int sahara_aes_crypt(struct skcipher_request *req, unsigned long mode)
@@ -651,21 +652,19 @@ static int sahara_aes_crypt(struct skcipher_request *req, 
unsigned long mode)
 
 static int sahara_aes_ecb_encrypt(struct skcipher_request *req)
 {
+   struct sahara_aes_reqctx *rctx = skcipher_request_ctx(req);
struct sahara_ctx *ctx = crypto_skcipher_ctx(
crypto_skcipher_reqtfm(req));
-   int err;
 
if (unlikely(ctx->keylen != AES_KEYSIZE_128)) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
-   return err;
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   return crypto_skcipher_encrypt(&rctx->fallback_req);
}
 
return sahara_aes_crypt(req, FLAGS_ENCRYPT);
@@ -673,21 +672,19 @@ static int sahara_aes_ecb_encrypt(struct skcipher_request 
*req)
 
 static int sahara_aes_ecb_decrypt(struct skcipher_request *req)
 {
+   struct sahara_aes_reqctx *rctx = skcipher_request_ctx(req);
struct sahara_ctx *ctx = crypto_skcipher_ctx(
crypto_skcipher_reqtfm(req));
-   int err;
 
if (unlikely(ctx->keylen != AES_KEYSIZE_128)) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   err = crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
- 

[PATCH 10/12] crypto: picoxcell - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the picoxcell driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/picoxcell_crypto.c | 34 +++-
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/picoxcell_crypto.c 
b/drivers/crypto/picoxcell_crypto.c
index 7384e91c8b32..eea75c7cbdf2 100644
--- a/drivers/crypto/picoxcell_crypto.c
+++ b/drivers/crypto/picoxcell_crypto.c
@@ -86,6 +86,7 @@ struct spacc_req {
dma_addr_t  src_addr, dst_addr;
struct spacc_ddt*src_ddt, *dst_ddt;
void(*complete)(struct spacc_req *req);
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 struct spacc_aead {
@@ -158,7 +159,7 @@ struct spacc_ablk_ctx {
 * The fallback cipher. If the operation can't be done in hardware,
 * fallback to a software version.
 */
-   struct crypto_sync_skcipher *sw_cipher;
+   struct crypto_skcipher  *sw_cipher;
 };
 
 /* AEAD cipher context. */
@@ -792,13 +793,13 @@ static int spacc_aes_setkey(struct crypto_skcipher 
*cipher, const u8 *key,
 * Set the fallback transform to use the same request flags as
 * the hardware transform.
 */
-   crypto_sync_skcipher_clear_flags(ctx->sw_cipher,
+   crypto_skcipher_clear_flags(ctx->sw_cipher,
CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->sw_cipher,
+   crypto_skcipher_set_flags(ctx->sw_cipher,
  cipher->base.crt_flags &
  CRYPTO_TFM_REQ_MASK);
 
-   err = crypto_sync_skcipher_setkey(ctx->sw_cipher, key, len);
+   err = crypto_skcipher_setkey(ctx->sw_cipher, key, len);
if (err)
goto sw_setkey_failed;
}
@@ -900,7 +901,7 @@ static int spacc_ablk_do_fallback(struct skcipher_request 
*req,
struct crypto_tfm *old_tfm =
crypto_skcipher_tfm(crypto_skcipher_reqtfm(req));
struct spacc_ablk_ctx *ctx = crypto_tfm_ctx(old_tfm);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->sw_cipher);
+   struct spacc_req *dev_req = skcipher_request_ctx(req);
int err;
 
/*
@@ -908,13 +909,13 @@ static int spacc_ablk_do_fallback(struct skcipher_request 
*req,
 * the ciphering has completed, put the old transform back into the
 * request.
 */
-   skcipher_request_set_sync_tfm(subreq, ctx->sw_cipher);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
+   skcipher_request_set_tfm(&dev_req->fallback_req, ctx->sw_cipher);
+   skcipher_request_set_callback(&dev_req->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&dev_req->fallback_req, req->src, req->dst,
   req->cryptlen, req->iv);
-   err = is_encrypt ? crypto_skcipher_encrypt(subreq) :
-  crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = is_encrypt ? crypto_skcipher_encrypt(&dev_req->fallback_req) :
+  crypto_skcipher_decrypt(&dev_req->fallback_req);
 
return err;
 }
@@ -1007,19 +1008,22 @@ static int spacc_ablk_init_tfm(struct crypto_skcipher 
*tfm)
ctx->generic.flags = spacc_alg->type;
ctx->generic.engine = engine;
if (alg->base.cra_flags & CRYPTO_ALG_NEED_FALLBACK) {
-   ctx->sw_cipher = crypto_alloc_sync_skcipher(
+   ctx->sw_cipher = crypto_alloc_skcipher(
alg->base.cra_name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(ctx->sw_cipher)) {
dev_warn(engine->dev, "fa

[PATCH 11/12] crypto: qce - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the qce driver implements asynchronous versions of ecb(aes),
cbc(aes) and xts(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this by allocating an ordinary skcipher
as the fallback and invoking it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/qce/cipher.h   |  3 ++-
 drivers/crypto/qce/skcipher.c | 27 ++--
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/crypto/qce/cipher.h b/drivers/crypto/qce/cipher.h
index 7770660bc853..cffa9fc628ff 100644
--- a/drivers/crypto/qce/cipher.h
+++ b/drivers/crypto/qce/cipher.h
@@ -14,7 +14,7 @@
 struct qce_cipher_ctx {
u8 enc_key[QCE_MAX_KEY_SIZE];
unsigned int enc_keylen;
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher *fallback;
 };
 
 /**
@@ -43,6 +43,7 @@ struct qce_cipher_reqctx {
struct sg_table src_tbl;
struct scatterlist *src_sg;
unsigned int cryptlen;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 static inline struct qce_alg_template *to_cipher_tmpl(struct crypto_skcipher 
*tfm)
diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 9412433f3b21..265afae29901 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -178,7 +178,7 @@ static int qce_skcipher_setkey(struct crypto_skcipher 
*ablk, const u8 *key,
break;
}
 
-   ret = crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   ret = crypto_skcipher_setkey(ctx->fallback, key, keylen);
if (!ret)
ctx->enc_keylen = keylen;
return ret;
@@ -235,16 +235,15 @@ static int qce_skcipher_crypt(struct skcipher_request 
*req, int encrypt)
  req->cryptlen <= aes_sw_max_len) ||
 (IS_XTS(rctx->flags) && req->cryptlen > QCE_SECTOR_SIZE &&
  req->cryptlen % QCE_SECTOR_SIZE))) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   ret = encrypt ? crypto_skcipher_encrypt(subreq) :
-   crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   ret = encrypt ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+   crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
 
@@ -275,8 +274,10 @@ static int qce_skcipher_init_fallback(struct 
crypto_skcipher *tfm)
struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
 
qce_skcipher_init(tfm);
-   ctx->fallback = 
crypto_alloc_sync_skcipher(crypto_tfm_alg_name(&tfm->base),
+   ctx->fallback = crypto_alloc_skcipher(crypto_tfm_alg_name(&tfm->base),
   0, CRYPTO_ALG_NEED_FALLBACK);
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct qce_cipher_reqctx) +
+
crypto_skcipher_reqsize(ctx->fallback));
return PTR_ERR_OR_ZERO(ctx->fallback);
 }
 
@@ -284,7 +285,7 @@ static void qce_skcipher_exit(struct crypto_skcipher *tfm)
 {
struct qce_cipher_ctx *ctx = crypto_skcipher_ctx(tfm);
 
-   crypto_free_sync_skcipher(ctx->fallback);
+   crypto_free_skcipher(ctx->fallback);
 }
 
 struct qce_skcipher_def {
-- 
2.27.0



[PATCH 03/12] crypto: omap-aes - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the omap-aes driver implements asynchronous versions of
ecb(aes), cbc(aes) and ctr(aes), the fallbacks it allocates are required
to be synchronous. Given that SIMD based software implementations are
usually asynchronous as well, even though they rarely complete
asynchronously (this typically only happens in cases where the request was
made from softirq context, while SIMD was already in use in the task
context that it interrupted), these implementations are disregarded, and
either the generic C version or another table based version implemented in
assembler is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/omap-aes.c | 35 ++--
 drivers/crypto/omap-aes.h |  3 +-
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/crypto/omap-aes.c b/drivers/crypto/omap-aes.c
index b5aff20c5900..25154b74dcc6 100644
--- a/drivers/crypto/omap-aes.c
+++ b/drivers/crypto/omap-aes.c
@@ -548,20 +548,18 @@ static int omap_aes_crypt(struct skcipher_request *req, 
unsigned long mode)
  !!(mode & FLAGS_CBC));
 
if (req->cryptlen < aes_fallback_sz) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
-
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL,
- NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
 
if (mode & FLAGS_ENCRYPT)
-   ret = crypto_skcipher_encrypt(subreq);
+   ret = crypto_skcipher_encrypt(&rctx->fallback_req);
else
-   ret = crypto_skcipher_decrypt(subreq);
-
-   skcipher_request_zero(subreq);
+   ret = crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
dd = omap_aes_find_dev(rctx);
@@ -590,11 +588,11 @@ static int omap_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
memcpy(ctx->key, key, keylen);
ctx->keylen = keylen;
 
-   crypto_sync_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
+   crypto_skcipher_clear_flags(ctx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(ctx->fallback, tfm->base.crt_flags &
 CRYPTO_TFM_REQ_MASK);
 
-   ret = crypto_sync_skcipher_setkey(ctx->fallback, key, keylen);
+   ret = crypto_skcipher_setkey(ctx->fallback, key, keylen);
if (!ret)
return 0;
 
@@ -640,15 +638,16 @@ static int omap_aes_init_tfm(struct crypto_skcipher *tfm)
 {
const char *name = crypto_tfm_alg_name(&tfm->base);
struct omap_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *blk;
+   struct crypto_skcipher *blk;
 
-   blk = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   blk = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(blk))
return PTR_ERR(blk);
 
ctx->fallback = blk;
 
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct omap_aes_reqctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct omap_aes_reqctx) +
+crypto_skcipher_reqsize(blk));
 
ctx->enginectx.op.prepare_request = omap_aes_prepare_req;
ctx->enginectx.op.unprepare_request = NULL;
@@ -662,7 +661,7 @@ static void omap_aes_exit_tfm(struct crypto_skcipher *tfm)
struct omap_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
 
if (ctx->fallback)
-   crypto_free_sync_skcipher(ctx->fallback);
+   crypto_free_skcipher(ctx->fallback);
 
ctx->fallback = NULL;
 }
diff --git a/drivers/crypto/omap-aes.h b/drivers/crypto/omap-aes.h
index 2d111bf906e1..23d073e87bb8 100644

[PATCH 05/12] crypto: sun8i-ce - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the sun8i-ce driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 41 ++--
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h|  3 +-
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c 
b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
index a6abb701bfc6..82c99da24dfd 100644
--- a/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c
@@ -58,23 +58,20 @@ static int sun8i_ce_cipher_fallback(struct skcipher_request 
*areq)
 #ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct sun8i_ce_alg_template *algt;
-#endif
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
 
-#ifdef CONFIG_CRYPTO_DEV_SUN8I_CE_DEBUG
algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
 
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (rctx->op_dir & CE_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -335,18 +332,20 @@ int sun8i_ce_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct sun8i_ce_alg_template, alg.skcipher);
op->ce = algt->ce;
 
-   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ce->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
+
dev_info(op->ce->dev, "Fallback for %s is %s\n",
 crypto_tfm_alg_driver_name(&sktfm->base),
-
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(&op->fallback_tfm->base)));
+
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(op->fallback_tfm)));
 
op->enginectx.op.do_one_request = sun8i_ce_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
@@ -358,7 +357,7 @@ int sun8i_ce_cipher_init(struct crypto_tfm *tfm)
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -370,7 +369,7 @@ void sun8i_ce_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put_sync_suspend(op->ce->dev);
 }
 
@@ -400,10 +399,10 @@ int sun8i_ce_aes_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
if (!op->key)
return -ENOMEM;
 
-   crypto_sync_skcipher_clear_flags(op->f

[PATCH 06/12] crypto: sun8i-ss - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the sun8i-ss driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c | 39 ++--
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h|  3 +-
 2 files changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c 
b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
index c89cb2ee2496..7a131675a41c 100644
--- a/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
+++ b/drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c
@@ -73,7 +73,6 @@ static int sun8i_ss_cipher_fallback(struct skcipher_request 
*areq)
struct sun8i_cipher_req_ctx *rctx = skcipher_request_ctx(areq);
int err;
 
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, op->fallback_tfm);
 #ifdef CONFIG_CRYPTO_DEV_SUN8I_SS_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct sun8i_ss_alg_template *algt;
@@ -81,15 +80,15 @@ static int sun8i_ss_cipher_fallback(struct skcipher_request 
*areq)
algt = container_of(alg, struct sun8i_ss_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
-   skcipher_request_set_sync_tfm(subreq, op->fallback_tfm);
-   skcipher_request_set_callback(subreq, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
if (rctx->op_dir & SS_DECRYPTION)
-   err = crypto_skcipher_decrypt(subreq);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -334,18 +333,20 @@ int sun8i_ss_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct sun8i_ss_alg_template, alg.skcipher);
op->ss = algt->ss;
 
-   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->ss->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct sun8i_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
+
dev_info(op->ss->dev, "Fallback for %s is %s\n",
 crypto_tfm_alg_driver_name(&sktfm->base),
-
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(&op->fallback_tfm->base)));
+
crypto_tfm_alg_driver_name(crypto_skcipher_tfm(op->fallback_tfm)));
 
op->enginectx.op.do_one_request = sun8i_ss_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
@@ -359,7 +360,7 @@ int sun8i_ss_cipher_init(struct crypto_tfm *tfm)
 
return 0;
 error_pm:
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
return err;
 }
 
@@ -371,7 +372,7 @@ void sun8i_ss_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
pm_runtime_put_sync(op->ss->dev);
 }
 
@@ -401,10 +402,10 @@ int sun8i_ss_aes_setkey(struct crypto_skcipher *tfm, 
const u8

[PATCH 01/12] crypto: amlogic-gxl - default to build as module

2020-06-25 Thread Ard Biesheuvel
The AmLogic GXL crypto accelerator driver is built into the kernel if
ARCH_MESON is set. However, given the single image policy of arm64, its
defconfig enables all platforms by default, and so ARCH_MESON is usually
enabled.

This means that the AmLogic driver causes the arm64 defconfig build to
pull in a huge chunk of the crypto stack as a builtin as well, which is
undesirable, so let's make the amlogic GXL driver default to 'm' instead.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/amlogic/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/amlogic/Kconfig b/drivers/crypto/amlogic/Kconfig
index cf9547602670..cf2c676a7093 100644
--- a/drivers/crypto/amlogic/Kconfig
+++ b/drivers/crypto/amlogic/Kconfig
@@ -1,7 +1,7 @@
 config CRYPTO_DEV_AMLOGIC_GXL
tristate "Support for amlogic cryptographic offloader"
depends on HAS_IOMEM
-   default y if ARCH_MESON
+   default m if ARCH_MESON
select CRYPTO_SKCIPHER
select CRYPTO_ENGINE
select CRYPTO_ECB
-- 
2.27.0



[PATCH 07/12] crypto: ccp - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the ccp driver implements an asynchronous version of xts(aes),
the fallback it allocates is required to be synchronous. Given that SIMD
based software implementations are usually asynchronous as well, even
though they rarely complete asynchronously (this typically only happens
in cases where the request was made from softirq context, while SIMD was
already in use in the task context that it interrupted), these
implementations are disregarded, and either the generic C version or
another table based version implemented in assembler is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c | 31 ++--
 drivers/crypto/ccp/ccp-crypto.h |  4 ++-
 2 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c 
b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
index 04b2517df955..e0fb4e8f22fb 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
@@ -98,7 +98,7 @@ static int ccp_aes_xts_setkey(struct crypto_skcipher *tfm, 
const u8 *key,
ctx->u.aes.key_len = key_len / 2;
sg_init_one(&ctx->u.aes.key_sg, ctx->u.aes.key, key_len);
 
-   return crypto_sync_skcipher_setkey(ctx->u.aes.tfm_skcipher, key, 
key_len);
+   return crypto_skcipher_setkey(ctx->u.aes.tfm_skcipher, key, key_len);
 }
 
 static int ccp_aes_xts_crypt(struct skcipher_request *req,
@@ -145,20 +145,19 @@ static int ccp_aes_xts_crypt(struct skcipher_request *req,
(ctx->u.aes.key_len != AES_KEYSIZE_256))
fallback = 1;
if (fallback) {
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq,
-  ctx->u.aes.tfm_skcipher);
-
/* Use the fallback to process the request for any
 * unsupported unit sizes or key sizes
 */
-   skcipher_request_set_sync_tfm(subreq, ctx->u.aes.tfm_skcipher);
-   skcipher_request_set_callback(subreq, req->base.flags,
- NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
-  req->cryptlen, req->iv);
-   ret = encrypt ? crypto_skcipher_encrypt(subreq) :
-   crypto_skcipher_decrypt(subreq);
-   skcipher_request_zero(subreq);
+   skcipher_request_set_tfm(&rctx->fallback_req,
+ctx->u.aes.tfm_skcipher);
+   skcipher_request_set_callback(&rctx->fallback_req,
+ req->base.flags,
+ req->base.complete,
+ req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src,
+  req->dst, req->cryptlen, req->iv);
+   ret = encrypt ? crypto_skcipher_encrypt(&rctx->fallback_req) :
+   crypto_skcipher_decrypt(&rctx->fallback_req);
return ret;
}
 
@@ -198,13 +197,12 @@ static int ccp_aes_xts_decrypt(struct skcipher_request 
*req)
 static int ccp_aes_xts_init_tfm(struct crypto_skcipher *tfm)
 {
struct ccp_ctx *ctx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *fallback_tfm;
+   struct crypto_skcipher *fallback_tfm;
 
ctx->complete = ccp_aes_xts_complete;
ctx->u.aes.key_len = 0;
 
-   fallback_tfm = crypto_alloc_sync_skcipher("xts(aes)", 0,
-CRYPTO_ALG_ASYNC |
+   fallback_tfm = crypto_alloc_skcipher("xts(aes)", 0,
 CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(fallback_tfm)) {
pr_warn("could not load fallback driver xts(aes)\n");
@@ -212,7 +210,8 @@ static int ccp_aes_xts_init_tfm(struct crypto_skcipher *tfm)
}
ctx->u.aes.tfm_skcipher = fallback_tfm;
 
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct ccp_aes_req_ctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct ccp_aes_req_ctx) +
+crypto_skcipher_reqsize(fallback_tfm));
 
return 0;
 }
diff --git a/drivers/crypto/ccp/ccp-crypto.h b/drivers/crypto/ccp/ccp-crypto.h
index 90a009e6b5c1..aed3d2192d01 100644
--- a/drivers/crypto/ccp/ccp-crypto.h
+++ b/drivers/crypto/ccp/ccp-crypto.h
@@ -89,7 +89,7 @@ static inli

[PATCH 08/12] crypto: chelsio - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the chelsio driver implements asynchronous versions of
cbc(aes) and xts(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/chelsio/chcr_algo.c   | 57 
 drivers/crypto/chelsio/chcr_crypto.h |  3 +-
 2 files changed, 25 insertions(+), 35 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index 4c2553672b6f..a6625b90fb1a 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -690,26 +690,22 @@ static int chcr_sg_ent_in_wr(struct scatterlist *src,
return min(srclen, dstlen);
 }
 
-static int chcr_cipher_fallback(struct crypto_sync_skcipher *cipher,
-   u32 flags,
-   struct scatterlist *src,
-   struct scatterlist *dst,
-   unsigned int nbytes,
+static int chcr_cipher_fallback(struct crypto_skcipher *cipher,
+   struct skcipher_request *req,
u8 *iv,
unsigned short op_type)
 {
+   struct chcr_skcipher_req_ctx *reqctx = skcipher_request_ctx(req);
int err;
 
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, cipher);
-
-   skcipher_request_set_sync_tfm(subreq, cipher);
-   skcipher_request_set_callback(subreq, flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, src, dst,
-  nbytes, iv);
+   skcipher_request_set_tfm(&reqctx->fallback_req, cipher);
+   skcipher_request_set_callback(&reqctx->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&reqctx->fallback_req, req->src, req->dst,
+  req->cryptlen, iv);
 
-   err = op_type ? crypto_skcipher_decrypt(subreq) :
-   crypto_skcipher_encrypt(subreq);
-   skcipher_request_zero(subreq);
+   err = op_type ? crypto_skcipher_decrypt(&reqctx->fallback_req) :
+   crypto_skcipher_encrypt(&reqctx->fallback_req);
 
return err;
 
@@ -924,11 +920,11 @@ static int chcr_cipher_fallback_setkey(struct 
crypto_skcipher *cipher,
 {
struct ablk_ctx *ablkctx = ABLK_CTX(c_ctx(cipher));
 
-   crypto_sync_skcipher_clear_flags(ablkctx->sw_cipher,
+   crypto_skcipher_clear_flags(ablkctx->sw_cipher,
CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(ablkctx->sw_cipher,
+   crypto_skcipher_set_flags(ablkctx->sw_cipher,
cipher->base.crt_flags & CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(ablkctx->sw_cipher, key, keylen);
+   return crypto_skcipher_setkey(ablkctx->sw_cipher, key, keylen);
 }
 
 static int chcr_aes_cbc_setkey(struct crypto_skcipher *cipher,
@@ -1206,13 +1202,8 @@ static int chcr_handle_cipher_resp(struct 
skcipher_request *req,
  req);
memcpy(req->iv, reqctx->init_iv, IV);
atomic_inc(&adap->chcr_stats.fallback);
-   err = chcr_cipher_fallback(ablkctx->sw_cipher,
-req->base.flags,
-req->src,
-req->dst,
-req->cryptlen,
-req->iv,
-reqctx->op);
+   err = chcr_cipher_fallback(ablkctx->sw_cipher, req, req->iv,
+  reqctx->op);
goto complete;
}
 
@@ -1341,11 +1332,7 @@ static int process_cipher(struct skcipher_request *req,
chcr_cipher_dma_unmap(&ULD_CTX(c_ctx(tfm))->lldi.pdev->dev,
  req);
 fallback:   atomic_inc(&adap->chcr_stats.fallback);
-   err = chcr_cipher_fallback(ablkctx->sw_cipher,
- 

[PATCH 09/12] crypto: mxs-dcp - permit asynchronous skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the mxs-dcp driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue, but
potentially a security issue as well (due to the fact that table based AES
is not time invariant), let's fix this, by allocating an ordinary skcipher
as the fallback, and invoke it with the completion routine that was given
to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/mxs-dcp.c | 33 ++--
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/mxs-dcp.c b/drivers/crypto/mxs-dcp.c
index d84530293036..909a7eb748e3 100644
--- a/drivers/crypto/mxs-dcp.c
+++ b/drivers/crypto/mxs-dcp.c
@@ -97,7 +97,7 @@ struct dcp_async_ctx {
unsigned int hot:1;
 
/* Crypto-specific context */
-   struct crypto_sync_skcipher *fallback;
+   struct crypto_skcipher  *fallback;
unsigned int key_len;
uint8_t key[AES_KEYSIZE_128];
 };
@@ -105,6 +105,7 @@ struct dcp_async_ctx {
 struct dcp_aes_req_ctx {
unsigned int enc:1;
unsigned int ecb:1;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 struct dcp_sha_req_ctx {
@@ -426,21 +427,20 @@ static int dcp_chan_thread_aes(void *data)
 static int mxs_dcp_block_fallback(struct skcipher_request *req, int enc)
 {
struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+   struct dcp_aes_req_ctx *rctx = skcipher_request_ctx(req);
struct dcp_async_ctx *ctx = crypto_skcipher_ctx(tfm);
-   SYNC_SKCIPHER_REQUEST_ON_STACK(subreq, ctx->fallback);
int ret;
 
-   skcipher_request_set_sync_tfm(subreq, ctx->fallback);
-   skcipher_request_set_callback(subreq, req->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(subreq, req->src, req->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);
+   skcipher_request_set_callback(&rctx->fallback_req, req->base.flags,
+ req->base.complete, req->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, req->src, req->dst,
   req->cryptlen, req->iv);
 
if (enc)
-   ret = crypto_skcipher_encrypt(subreq);
+   ret = crypto_skcipher_encrypt(&rctx->fallback_req);
else
-   ret = crypto_skcipher_decrypt(subreq);
-
-   skcipher_request_zero(subreq);
+   ret = crypto_skcipher_decrypt(&rctx->fallback_req);
 
return ret;
 }
@@ -510,24 +510,25 @@ static int mxs_dcp_aes_setkey(struct crypto_skcipher 
*tfm, const u8 *key,
 * but is supported by in-kernel software implementation, we use
 * software fallback.
 */
-   crypto_sync_skcipher_clear_flags(actx->fallback, CRYPTO_TFM_REQ_MASK);
-   crypto_sync_skcipher_set_flags(actx->fallback,
+   crypto_skcipher_clear_flags(actx->fallback, CRYPTO_TFM_REQ_MASK);
+   crypto_skcipher_set_flags(actx->fallback,
  tfm->base.crt_flags & CRYPTO_TFM_REQ_MASK);
-   return crypto_sync_skcipher_setkey(actx->fallback, key, len);
+   return crypto_skcipher_setkey(actx->fallback, key, len);
 }
 
 static int mxs_dcp_aes_fallback_init_tfm(struct crypto_skcipher *tfm)
 {
const char *name = crypto_tfm_alg_name(crypto_skcipher_tfm(tfm));
struct dcp_async_ctx *actx = crypto_skcipher_ctx(tfm);
-   struct crypto_sync_skcipher *blk;
+   struct crypto_skcipher *blk;
 
-   blk = crypto_alloc_sync_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
+   blk = crypto_alloc_skcipher(name, 0, CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(blk))
return PTR_ERR(blk);
 
actx->fallback = blk;
-   crypto_skcipher_set_reqsize(tfm, sizeof(struct dcp_aes_req_ctx));
+   crypto_skcipher_set_reqsize(tfm, sizeof(struct dcp_aes_req_ctx) +
+crypto_skcipher_reqsize(blk));
return 0;
 }
 
@@ -535,7 +536,7 @@ static void mxs_dcp_aes_fallback_exit_tfm(struct 
crypto_skcipher *tfm)
 {
struct dcp_async_ctx *actx = crypto_skcipher_ctx(tfm);
 
-   crypto_free_sync_skcipher(actx->fallback);
+   crypto_free_skcipher(actx->fallback);
 }
 
 /*
-- 
2.27.0



[PATCH 02/12] crypto: amlogic-gxl - permit async skcipher as fallback

2020-06-25 Thread Ard Biesheuvel
Even though the amlogic-gxl driver implements asynchronous versions of
ecb(aes) and cbc(aes), the fallbacks it allocates are required to be
synchronous. Given that SIMD based software implementations are usually
asynchronous as well, even though they rarely complete asynchronously
(this typically only happens in cases where the request was made from
softirq context, while SIMD was already in use in the task context that
it interrupted), these implementations are disregarded, and either the
generic C version or another table based version implemented in assembler
is selected instead.

Since falling back to synchronous AES is not only a performance issue,
but potentially a security issue as well (due to the fact that table
based AES is not time invariant), let's fix this, by allocating an
ordinary skcipher as the fallback, and invoke it with the completion
routine that was given to the outer request.

Signed-off-by: Ard Biesheuvel 
---
 drivers/crypto/amlogic/amlogic-gxl-cipher.c | 27 ++--
 drivers/crypto/amlogic/amlogic-gxl.h|  3 ++-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/crypto/amlogic/amlogic-gxl-cipher.c 
b/drivers/crypto/amlogic/amlogic-gxl-cipher.c
index 9819dd50fbad..5880b94dcb32 100644
--- a/drivers/crypto/amlogic/amlogic-gxl-cipher.c
+++ b/drivers/crypto/amlogic/amlogic-gxl-cipher.c
@@ -64,22 +64,20 @@ static int meson_cipher_do_fallback(struct skcipher_request 
*areq)
 #ifdef CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG
struct skcipher_alg *alg = crypto_skcipher_alg(tfm);
struct meson_alg_template *algt;
-#endif
-   SYNC_SKCIPHER_REQUEST_ON_STACK(req, op->fallback_tfm);
 
-#ifdef CONFIG_CRYPTO_DEV_AMLOGIC_GXL_DEBUG
algt = container_of(alg, struct meson_alg_template, alg.skcipher);
algt->stat_fb++;
 #endif
-   skcipher_request_set_sync_tfm(req, op->fallback_tfm);
-   skcipher_request_set_callback(req, areq->base.flags, NULL, NULL);
-   skcipher_request_set_crypt(req, areq->src, areq->dst,
+   skcipher_request_set_tfm(&rctx->fallback_req, op->fallback_tfm);
+   skcipher_request_set_callback(&rctx->fallback_req, areq->base.flags,
+ areq->base.complete, areq->base.data);
+   skcipher_request_set_crypt(&rctx->fallback_req, areq->src, areq->dst,
   areq->cryptlen, areq->iv);
+
if (rctx->op_dir == MESON_DECRYPT)
-   err = crypto_skcipher_decrypt(req);
+   err = crypto_skcipher_decrypt(&rctx->fallback_req);
else
-   err = crypto_skcipher_encrypt(req);
-   skcipher_request_zero(req);
+   err = crypto_skcipher_encrypt(&rctx->fallback_req);
return err;
 }
 
@@ -321,15 +319,16 @@ int meson_cipher_init(struct crypto_tfm *tfm)
algt = container_of(alg, struct meson_alg_template, alg.skcipher);
op->mc = algt->mc;
 
-   sktfm->reqsize = sizeof(struct meson_cipher_req_ctx);
-
-   op->fallback_tfm = crypto_alloc_sync_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
+   op->fallback_tfm = crypto_alloc_skcipher(name, 0, 
CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(op->fallback_tfm)) {
dev_err(op->mc->dev, "ERROR: Cannot allocate fallback for %s 
%ld\n",
name, PTR_ERR(op->fallback_tfm));
return PTR_ERR(op->fallback_tfm);
}
 
+   sktfm->reqsize = sizeof(struct meson_cipher_req_ctx) +
+crypto_skcipher_reqsize(op->fallback_tfm);
+
op->enginectx.op.do_one_request = meson_handle_cipher_request;
op->enginectx.op.prepare_request = NULL;
op->enginectx.op.unprepare_request = NULL;
@@ -345,7 +344,7 @@ void meson_cipher_exit(struct crypto_tfm *tfm)
memzero_explicit(op->key, op->keylen);
kfree(op->key);
}
-   crypto_free_sync_skcipher(op->fallback_tfm);
+   crypto_free_skcipher(op->fallback_tfm);
 }
 
 int meson_aes_setkey(struct crypto_skcipher *tfm, const u8 *key,
@@ -377,5 +376,5 @@ int meson_aes_setkey(struct crypto_skcipher *tfm, const u8 
*key,
if (!op->key)
return -ENOMEM;
 
-   return crypto_sync_skcipher_setkey(op->fallback_tfm, key, keylen);
+   return crypto_skcipher_setkey(op->fallback_tfm, key, keylen);
 }
diff --git a/drivers/crypto/amlogic/amlogic-gxl.h 
b/drivers/crypto/amlogic/amlogic-gxl.h
index b7f2de91ab76..dc0f142324a3 100644
--- a/drivers/crypto/amlogic/amlogic-gxl.h
+++ b/drivers/crypto/amlogic/amlogic-gxl.h
@@ -109,6 +109,7 @@ struct meson_dev {
 struct meson_cipher_req_ctx {
u32 op_dir;
int flow;
+   struct skcipher_request fallback_req;   // keep at the end
 };
 
 /*
@@ -126,7 +127,7 @@ struct meson_cipher_tfm_ctx {
u32 keylen;
u32 keymode;
struct meson_dev *mc;
-   struct crypto_sync_skcipher *fallback_tfm;
+   struct crypto_skcipher *fallback_tfm;
 };
 
 /*
-- 
2.27.0



[PATCH 00/12] crypto: permit asynchronous skciphers as driver fallbacks

2020-06-25 Thread Ard Biesheuvel
The drivers for crypto accelerators in drivers/crypto all implement skciphers
of an asynchronous nature, given that they are backed by hardware DMA that
completes asynchronously wrt the execution flow.

However, in many cases, any fallbacks they allocate are limited to the
synchronous variety, which rules out the use of SIMD implementations of
AES in ECB, CBC and XTS modes, given that they are usually built on top
of the asynchronous SIMD helper, which queues requests for asynchronous
completion if they are issued from a context that does not permit the use
of the SIMD register file.

This may result in sub-optimal AES implementations being selected as
fallbacks, or even less secure ones if the only synchronous alternative
is table based, and therefore not time invariant.

So switch all these cases over to the asynchronous API, by moving the
subrequest into the skcipher request context, and permitting it to
complete asynchronously via the caller provided completion function.
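
Roughly, the resulting request path looks like the following in every driver
(sketch only - the ctx/rctx/enc names are placeholders, not any particular
driver's code):

	struct drv_reqctx *rctx = skcipher_request_ctx(req);

	/* the fallback request lives at the end of the driver's reqctx,
	 * sized via crypto_skcipher_set_reqsize() at init time */
	skcipher_request_set_tfm(&rctx->fallback_req, ctx->fallback);

	/* forward the caller's completion routine instead of passing NULL,
	 * so the fallback is allowed to complete asynchronously */
	skcipher_request_set_callback(&rctx->fallback_req, req->base.flags,
				      req->base.complete, req->base.data);
	skcipher_request_set_crypt(&rctx->fallback_req, req->src, req->dst,
				   req->cryptlen, req->iv);

	return enc ? crypto_skcipher_encrypt(&rctx->fallback_req)
		   : crypto_skcipher_decrypt(&rctx->fallback_req);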

Patch #1 is not related, but touches the same driver as #2 so it is
included anyway.

Only OMAP was tested on actual hardware - the others are build tested only.

Cc: Corentin Labbe 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: Maxime Ripard 
Cc: Chen-Yu Tsai 
Cc: Tom Lendacky 
Cc: Ayush Sawal 
Cc: Vinay Kumar Yadav 
Cc: Rohit Maheshwari 
Cc: Shawn Guo 
Cc: Sascha Hauer 
Cc: Pengutronix Kernel Team 
Cc: Fabio Estevam 
Cc: NXP Linux Team 
Cc: Jamie Iles 
Cc: Eric Biggers 

Ard Biesheuvel (12):
  crypto: amlogic-gxl - default to build as module
  crypto: amlogic-gxl - permit async skcipher as fallback
  crypto: omap-aes - permit asynchronous skcipher as fallback
  crypto: sun4i - permit asynchronous skcipher as fallback
  crypto: sun8i-ce - permit asynchronous skcipher as fallback
  crypto: sun8i-ss - permit asynchronous skcipher as fallback
  crypto: ccp - permit asynchronous skcipher as fallback
  crypto: chelsio - permit asynchronous skcipher as fallback
  crypto: mxs-dcp - permit asynchronous skcipher as fallback
  crypto: picoxcell - permit asynchronous skcipher as fallback
  crypto: qce - permit asynchronous skcipher as fallback
  crypto: sahara - permit asynchronous skcipher as fallback

 drivers/crypto/allwinner/sun4i-ss/sun4i-ss-cipher.c | 46 +-
 drivers/crypto/allwinner/sun4i-ss/sun4i-ss.h|  3 +-
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce-cipher.c | 41 -
 drivers/crypto/allwinner/sun8i-ce/sun8i-ce.h|  3 +-
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss-cipher.c | 39 
 drivers/crypto/allwinner/sun8i-ss/sun8i-ss.h|  3 +-
 drivers/crypto/amlogic/Kconfig  |  2 +-
 drivers/crypto/amlogic/amlogic-gxl-cipher.c | 27 +++---
 drivers/crypto/amlogic/amlogic-gxl.h|  3 +-
 drivers/crypto/ccp/ccp-crypto-aes-xts.c | 31 +++
 drivers/crypto/ccp/ccp-crypto.h |  4 +-
 drivers/crypto/chelsio/chcr_algo.c  | 57 +---
 drivers/crypto/chelsio/chcr_crypto.h|  3 +-
 drivers/crypto/mxs-dcp.c| 33 +++
 drivers/crypto/omap-aes.c   | 35 ---
 drivers/crypto/omap-aes.h   |  3 +-
 drivers/crypto/picoxcell_crypto.c   | 34 ---
 drivers/crypto/qce/cipher.h |  3 +-
 drivers/crypto/qce/skcipher.c   | 27 +++---
 drivers/crypto/sahara.c | 96 +---
 20 files changed, 244 insertions(+), 249 deletions(-)

-- 
2.27.0



[PATCH v2] net: phy: mscc: avoid skcipher API for single block AES encryption

2020-06-25 Thread Ard Biesheuvel
The skcipher API dynamically instantiates the transformation object
on request that implements the requested algorithm optimally on the
given platform. This notion of optimality only matters for cases like
bulk network or disk encryption, where performance can be a bottleneck,
or in cases where the algorithm itself is not known at compile time.

In the mscc case, we are dealing with AES encryption of a single
block, and so neither concern applies, and we are better off using
the AES library interface, which is lightweight and safe for this
kind of use.

Note that the scatterlist API does not permit references to buffers
that are located on the stack, so the existing code is incorrect in
any case, but avoiding the skcipher and scatterlist APIs entirely is
the most straight-forward approach to fixing this.

Cc: Antoine Tenart 
Cc: Andrew Lunn 
Cc: Florian Fainelli 
Cc: Heiner Kallweit 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: 
Fixes: 28c5107aa904e ("net: phy: mscc: macsec support")
Reviewed-by: Eric Biggers 
Signed-off-by: Ard Biesheuvel 
---
v2:
- select CRYPTO_LIB_AES only if MACSEC is enabled
- add Eric's R-b

 drivers/net/phy/Kconfig|  3 +-
 drivers/net/phy/mscc/mscc_macsec.c | 40 +---
 2 files changed, 10 insertions(+), 33 deletions(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index f25702386d83..e351d65533aa 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -480,8 +480,7 @@ config MICROCHIP_T1_PHY
 config MICROSEMI_PHY
tristate "Microsemi PHYs"
depends on MACSEC || MACSEC=n
-   select CRYPTO_AES
-   select CRYPTO_ECB
+   select CRYPTO_LIB_AES if MACSEC
help
  Currently supports VSC8514, VSC8530, VSC8531, VSC8540 and VSC8541 PHYs
 
diff --git a/drivers/net/phy/mscc/mscc_macsec.c 
b/drivers/net/phy/mscc/mscc_macsec.c
index b4d3dc4068e2..d53ca884b5c9 100644
--- a/drivers/net/phy/mscc/mscc_macsec.c
+++ b/drivers/net/phy/mscc/mscc_macsec.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 #include 
 
@@ -500,39 +500,17 @@ static u32 vsc8584_macsec_flow_context_id(struct 
macsec_flow *flow)
 static int vsc8584_macsec_derive_key(const u8 key[MACSEC_KEYID_LEN],
 u16 key_len, u8 hkey[16])
 {
-   struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
-   struct skcipher_request *req = NULL;
-   struct scatterlist src, dst;
-   DECLARE_CRYPTO_WAIT(wait);
-   u32 input[4] = {0};
+   const u8 input[AES_BLOCK_SIZE] = {0};
+   struct crypto_aes_ctx ctx;
int ret;
 
-   if (IS_ERR(tfm))
-   return PTR_ERR(tfm);
-
-   req = skcipher_request_alloc(tfm, GFP_KERNEL);
-   if (!req) {
-   ret = -ENOMEM;
-   goto out;
-   }
-
-   skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
- CRYPTO_TFM_REQ_MAY_SLEEP, crypto_req_done,
- &wait);
-   ret = crypto_skcipher_setkey(tfm, key, key_len);
-   if (ret < 0)
-   goto out;
-
-   sg_init_one(&src, input, 16);
-   sg_init_one(&dst, hkey, 16);
-   skcipher_request_set_crypt(req, &src, &dst, 16, NULL);
-
-   ret = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
+   ret = aes_expandkey(&ctx, key, key_len);
+   if (ret)
+   return ret;
 
-out:
-   skcipher_request_free(req);
-   crypto_free_skcipher(tfm);
-   return ret;
+   aes_encrypt(&ctx, hkey, input);
+   memzero_explicit(&ctx, sizeof(ctx));
+   return 0;
 }
 
 static int vsc8584_macsec_transformation(struct phy_device *phydev,
-- 
2.27.0



Re: [PATCH] net: phy: mscc: avoid skcipher API for single block AES encryption

2020-06-24 Thread Ard Biesheuvel
On Wed, 24 Jun 2020 at 18:32, Eric Biggers  wrote:
>
> On Wed, Jun 24, 2020 at 03:34:27PM +0200, Ard Biesheuvel wrote:
> > The skcipher API dynamically instantiates the transformation object on
> > request that implements the requested algorithm optimally on the given
> > platform. This notion of optimality only matters for cases like bulk
> > network or disk encryption, where performance can be a bottleneck, or
> > in cases where the algorithm itself is not known at compile time.
> >
> > In the mscc macsec case, we are dealing with AES encryption of a single
> > block, and so neither concern applies, and we are better off using the
> > AES library interface, which is lightweight and safe for this kind of
> > use.
> >
> > Note that the scatterlist API does not permit references to buffers that
> > are located on the stack, so the existing code is incorrect in any case,
> > but avoiding the skcipher and scatterlist APIs altogether is the most
> > straight-forward approach to fixing this.
> >
> > Cc: Antoine Tenart 
> > Cc: Andrew Lunn 
> > Cc: Florian Fainelli 
> > Cc: Heiner Kallweit 
> > Cc: "David S. Miller" 
> > Cc: Jakub Kicinski 
> > Cc: 
> > Fixes: 28c5107aa904e ("net: phy: mscc: macsec support")
> > Signed-off-by: Ard Biesheuvel 
> > ---
> >  drivers/net/phy/Kconfig|  3 +-
> >  drivers/net/phy/mscc/mscc_macsec.c | 40 +---
> >  2 files changed, 10 insertions(+), 33 deletions(-)
> >
> > diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
> > index f25702386d83..e9c05848ec52 100644
> > --- a/drivers/net/phy/Kconfig
> > +++ b/drivers/net/phy/Kconfig
> > @@ -480,8 +480,7 @@ config MICROCHIP_T1_PHY
> >  config MICROSEMI_PHY
> >   tristate "Microsemi PHYs"
> >   depends on MACSEC || MACSEC=n
> > - select CRYPTO_AES
> > - select CRYPTO_ECB
> > + select CRYPTO_LIB_AES
> >   help
> > Currently supports VSC8514, VSC8530, VSC8531, VSC8540 and VSC8541 
> > PHYs
>
> Shouldn't it be 'select CRYPTO_LIB_AES if MACSEC', since
> mscc_macsec.c is only compiled if MACSEC?
>

Good point, I'll change that.

> >
> > diff --git a/drivers/net/phy/mscc/mscc_macsec.c 
> > b/drivers/net/phy/mscc/mscc_macsec.c
> > index b4d3dc4068e2..d53ca884b5c9 100644
> > --- a/drivers/net/phy/mscc/mscc_macsec.c
> > +++ b/drivers/net/phy/mscc/mscc_macsec.c
> > @@ -10,7 +10,7 @@
> >  #include 
> >  #include 
> >
> > -#include 
> > +#include 
> >
> >  #include 
> >
> > @@ -500,39 +500,17 @@ static u32 vsc8584_macsec_flow_context_id(struct 
> > macsec_flow *flow)
> >  static int vsc8584_macsec_derive_key(const u8 key[MACSEC_KEYID_LEN],
> >u16 key_len, u8 hkey[16])
> >  {
> > - struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
> > - struct skcipher_request *req = NULL;
> > - struct scatterlist src, dst;
> > - DECLARE_CRYPTO_WAIT(wait);
> > - u32 input[4] = {0};
> > + const u8 input[AES_BLOCK_SIZE] = {0};
> > + struct crypto_aes_ctx ctx;
> >   int ret;
> >
> > - if (IS_ERR(tfm))
> > - return PTR_ERR(tfm);
> > -
> > - req = skcipher_request_alloc(tfm, GFP_KERNEL);
> > - if (!req) {
> > - ret = -ENOMEM;
> > - goto out;
> > - }
> > -
> > - skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
> > -   CRYPTO_TFM_REQ_MAY_SLEEP, 
> > crypto_req_done,
> > -   &wait);
> > - ret = crypto_skcipher_setkey(tfm, key, key_len);
> > - if (ret < 0)
> > - goto out;
> > -
> > - sg_init_one(&src, input, 16);
> > - sg_init_one(&dst, hkey, 16);
> > - skcipher_request_set_crypt(req, &src, &dst, 16, NULL);
> > -
> > - ret = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
> > + ret = aes_expandkey(&ctx, key, key_len);
> > + if (ret)
> > + return ret;
> >
> > -out:
> > - skcipher_request_free(req);
> > - crypto_free_skcipher(tfm);
> > - return ret;
> > + aes_encrypt(&ctx, hkey, input);
> > + memzero_explicit(&ctx, sizeof(ctx));
> > + return 0;
> >  }
> >
>
> Otherwise this looks good.  You can add:
>
> Reviewed-by: Eric Biggers 
>

Thanks


[PATCH] net: phy: mscc: avoid skcipher API for single block AES encryption

2020-06-24 Thread Ard Biesheuvel
The skcipher API dynamically instantiates the transformation object on
request that implements the requested algorithm optimally on the given
platform. This notion of optimality only matters for cases like bulk
network or disk encryption, where performance can be a bottleneck, or
in cases where the algorithm itself is not known at compile time.

In the mscc macsec case, we are dealing with AES encryption of a single
block, and so neither concern applies, and we are better off using the
AES library interface, which is lightweight and safe for this kind of
use.

Note that the scatterlist API does not permit references to buffers that
are located on the stack, so the existing code is incorrect in any case,
but avoiding the skcipher and scatterlist APIs altogether is the most
straight-forward approach to fixing this.

Cc: Antoine Tenart 
Cc: Andrew Lunn 
Cc: Florian Fainelli 
Cc: Heiner Kallweit 
Cc: "David S. Miller" 
Cc: Jakub Kicinski 
Cc: 
Fixes: 28c5107aa904e ("net: phy: mscc: macsec support")
Signed-off-by: Ard Biesheuvel 
---
 drivers/net/phy/Kconfig|  3 +-
 drivers/net/phy/mscc/mscc_macsec.c | 40 +---
 2 files changed, 10 insertions(+), 33 deletions(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index f25702386d83..e9c05848ec52 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -480,8 +480,7 @@ config MICROCHIP_T1_PHY
 config MICROSEMI_PHY
tristate "Microsemi PHYs"
depends on MACSEC || MACSEC=n
-   select CRYPTO_AES
-   select CRYPTO_ECB
+   select CRYPTO_LIB_AES
help
  Currently supports VSC8514, VSC8530, VSC8531, VSC8540 and VSC8541 PHYs
 
diff --git a/drivers/net/phy/mscc/mscc_macsec.c 
b/drivers/net/phy/mscc/mscc_macsec.c
index b4d3dc4068e2..d53ca884b5c9 100644
--- a/drivers/net/phy/mscc/mscc_macsec.c
+++ b/drivers/net/phy/mscc/mscc_macsec.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 #include 
 
@@ -500,39 +500,17 @@ static u32 vsc8584_macsec_flow_context_id(struct 
macsec_flow *flow)
 static int vsc8584_macsec_derive_key(const u8 key[MACSEC_KEYID_LEN],
 u16 key_len, u8 hkey[16])
 {
-   struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
-   struct skcipher_request *req = NULL;
-   struct scatterlist src, dst;
-   DECLARE_CRYPTO_WAIT(wait);
-   u32 input[4] = {0};
+   const u8 input[AES_BLOCK_SIZE] = {0};
+   struct crypto_aes_ctx ctx;
int ret;
 
-   if (IS_ERR(tfm))
-   return PTR_ERR(tfm);
-
-   req = skcipher_request_alloc(tfm, GFP_KERNEL);
-   if (!req) {
-   ret = -ENOMEM;
-   goto out;
-   }
-
-   skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG |
- CRYPTO_TFM_REQ_MAY_SLEEP, crypto_req_done,
- &wait);
-   ret = crypto_skcipher_setkey(tfm, key, key_len);
-   if (ret < 0)
-   goto out;
-
-   sg_init_one(&src, input, 16);
-   sg_init_one(&dst, hkey, 16);
-   skcipher_request_set_crypt(req, &src, &dst, 16, NULL);
-
-   ret = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);
+   ret = aes_expandkey(&ctx, key, key_len);
+   if (ret)
+   return ret;
 
-out:
-   skcipher_request_free(req);
-   crypto_free_skcipher(tfm);
-   return ret;
+   aes_encrypt(&ctx, hkey, input);
+   memzero_explicit(&ctx, sizeof(ctx));
+   return 0;
 }
 
 static int vsc8584_macsec_transformation(struct phy_device *phydev,
-- 
2.27.0



Re: [v2 PATCH 0/3] crypto: skcipher - Add support for no chaining and partial chaining

2020-06-15 Thread Ard Biesheuvel
On Mon, 15 Jun 2020 at 20:50, Eric Biggers  wrote:
>
> On Mon, Jun 15, 2020 at 09:50:50AM +0200, Ard Biesheuvel wrote:
> > On Mon, 15 Jun 2020 at 09:30, Herbert Xu  
> > wrote:
> > >
> > > On Fri, Jun 12, 2020 at 06:10:57PM +0200, Ard Biesheuvel wrote:
> > > >
> > > > First of all, the default fcsize for all existing XTS implementations
> > > > should be -1 as well, given that chaining is currently not supported
> > > > at all at the skcipher interface layer for any of them (due to the
> > > > fact that the IV gets encrypted with a different key at the start of
> > >
> > > Sure.  I was just too lazy to actually set the -1 everywhere.  I'll
> > > try to do that before I repost again.
> > >
> >
> > Fair enough
> >
> > > > the operation). This also means it is going to be rather tricky to
> > > > implement for h/w accelerated XTS implementations, and it seems to me
> > > > that the only way to deal with this is to decrypt the IV in software
> > > > before chaining the next operation, which is rather horrid and needs
> > > > to be implemented by all of them.
> > >
> > > I don't think we should support chaining for XTS at all so I don't
> > > see why we need to worry about the hardware accelerated XTS code.
> > >
> >
> > I would prefer that. But if it is fine to disallow chaining altogether
> > for XTS, why can't we do the same for cbc-cts? In both cases, user
> > space cannot be relying on it today, since the output is incorrect,
> > even for inputs that are a round multiple of the block size but are
> > broken up and chained.
> >
> > > > Given that
> > > >
> > > > a) this is wholly an AF_ALG issue, as there are no in-kernel users
> > > > currently suffering from this afaik,
> > > > b) using AF_ALG to get access to software implementations is rather
> > > > pointless in general, given that userspace can simply issue the same
> > > > instructions directly
> > > > c) fixing all XTS and CTS implementation on all arches and all
> > > > accelerators is not a small task
> > > >
> > > > wouldn't it be better to special case XTS and CBC-CTS in
> > > > algif_skcipher instead, rather than polluting the skipcher API this
> > > > way?
> > >
> > > As I said we need to be able to differentiate between the ones
> > > that can chain vs. the ones that can't.  Putting this knowledge
> > > directly into algif_skcipher is just too horrid.
> > >
> >
> > No disagreement on the horrid. But polluting the API for an issue that
> > only affects AF_ALG, which can't possibly be working as expected right
> > now is not a great thing either.
> >
> > > The alternative is to add this marker into the algorithms.  My
> > > point was that if you're going to do that you might as well go
> > > a step further and allow cts to chain as it is so straightforward.
> > >
> >
> > Given the fact that algos that require chaining are broken today and
> > nobody noticed until Stephan started relying on the skcipher request
> > object's IV field magically retaining its value on subsequent reuse, I
> > would prefer it if we could simply mark all of them as non-chainable
> > and be done with it. (Note that Stephan's case was invalid to begin
> > with)
>
> Wouldn't it make a lot more sense to make skcipher algorithms non-chainable by
> default, and only opt-in the ones where chaining is actually working?  At the
> moment we only test iv_out for CBC and CTR, so we can expect that all the 
> others
> are broken.
>

Agreed. There is a difference, though. XTS and CBC-CTS are
guaranteed not to be used in a chaining manner today, given that there
is no possible way you could get the right output for a AF_ALG request
that has been split into several skcipher requests: XTS has the IV
encryption that occurs only once, and CBC-CTS has the unconditional
swapping of the last two blocks, which occurs even if the output is a
whole multiple of the block size. Doing either of these more than once
will necessarily result in corrupt output.

For cases where chaining is more straight-forward, we may have users
that we are unaware of, so it is trickier. But that only means that we
may need to require iv_out support for other modes than CBC and CTR,
not that we need to add complexity like this series is doing.

For now, I would prefer to simply introduce a 'permits chaining' flag
that only gets set for CBC

Re: [v2 PATCH 0/3] crypto: skcipher - Add support for no chaining and partial chaining

2020-06-15 Thread Ard Biesheuvel
On Mon, 15 Jun 2020 at 09:30, Herbert Xu  wrote:
>
> On Fri, Jun 12, 2020 at 06:10:57PM +0200, Ard Biesheuvel wrote:
> >
> > First of all, the default fcsize for all existing XTS implementations
> > should be -1 as well, given that chaining is currently not supported
> > at all at the skcipher interface layer for any of them (due to the
> > fact that the IV gets encrypted with a different key at the start of
>
> Sure.  I was just too lazy to actually set the -1 everywhere.  I'll
> try to do that before I repost again.
>

Fair enough

> > the operation). This also means it is going to be rather tricky to
> > implement for h/w accelerated XTS implementations, and it seems to me
> > that the only way to deal with this is to decrypt the IV in software
> > before chaining the next operation, which is rather horrid and needs
> > to be implemented by all of them.
>
> I don't think we should support chaining for XTS at all so I don't
> see why we need to worry about the hardware accelerated XTS code.
>

I would prefer that. But if it is fine to disallow chaining altogether
for XTS, why can't we do the same for cbc-cts? In both cases, user
space cannot be relying on it today, since the output is incorrect,
even for inputs that are a round multiple of the block size but are
broken up and chained.

> > Given that
> >
> > a) this is wholly an AF_ALG issue, as there are no in-kernel users
> > currently suffering from this afaik,
> > b) using AF_ALG to get access to software implementations is rather
> > pointless in general, given that userspace can simply issue the same
> > instructions directly
> > c) fixing all XTS and CTS implementation on all arches and all
> > accelerators is not a small task
> >
> > wouldn't it be better to special case XTS and CBC-CTS in
> > algif_skcipher instead, rather than polluting the skipcher API this
> > way?
>
> As I said we need to be able to differentiate between the ones
> that can chain vs. the ones that can't.  Putting this knowledge
> directly into algif_skcipher is just too horrid.
>

No disagreement on the horrid. But polluting the API for an issue that
only affects AF_ALG, which can't possibly be working as expected right
now is not a great thing either.

> The alternative is to add this marker into the algorithms.  My
> point was that if you're going to do that you might as well go
> a step further and allow cts to chain as it is so straightforward.
>

Given the fact that algos that require chaining are broken today and
nobody noticed until Stephan started relying on the skcipher request
object's IV field magically retaining its value on subsequent reuse, I
would prefer it if we could simply mark all of them as non-chainable
and be done with it. (Note that Stephan's case was invalid to begin
with)


Re: [v2 PATCH 0/3] crypto: skcipher - Add support for no chaining and partial chaining

2020-06-12 Thread Ard Biesheuvel
On Fri, 12 Jun 2020 at 14:21, Herbert Xu  wrote:
>
> v2
>
> Fixed return type of crypto_skcipher_fcsize.
>
> --
>
> This patch-set adds support to the Crypto API and algif_skcipher
> for algorithms that cannot be chained, as well as ones that can
> be chained if you withhold a certain number of blocks at the end.
>
> It only modifies one algorithm to utilise this, namely cts-generic.
> Changing others should be fairly straightforward.  In particular,
> we should mark all the ones that don't support chaining (e.g., most
> stream ciphers).
>

I understand that there is an oversight here that we need to address,
but I am not crazy about this approach, tbh.

First of all, the default fcsize for all existing XTS implementations
should be -1 as well, given that chaining is currently not supported
at all at the skcipher interface layer for any of them (due to the
fact that the IV gets encrypted with a different key at the start of
the operation). This also means it is going to be rather tricky to
implement for h/w accelerated XTS implementations, and it seems to me
that the only way to deal with this is to decrypt the IV in software
before chaining the next operation, which is rather horrid and needs
to be implemented by all of them.
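
To make that concrete, a simplified sketch of the XTS flow (pseudo-code in
a comment, not the actual kernel implementation):

	/*
	 *	T = AES-enc(K2, IV);			// done once per request
	 *	for each 16-byte block P[i]:
	 *		C[i] = AES-enc(K1, P[i] ^ T) ^ T;
	 *		T = T * x;			// in GF(2^128)
	 *
	 * A chained follow-up request can only pass in an IV, which would be
	 * encrypted with K2 all over again. So to resume correctly, the
	 * current tweak T would have to be decrypted with K2 and handed back
	 * as the "output IV" - the software fixup described above.
	 */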

Given that

a) this is wholly an AF_ALG issue, as there are no in-kernel users
currently suffering from this afaik,
b) using AF_ALG to get access to software implementations is rather
pointless in general, given that userspace can simply issue the same
instructions directly
c) fixing all XTS and CTS implementation on all arches and all
accelerators is not a small task

wouldn't it be better to special case XTS and CBC-CTS in
algif_skcipher instead, rather than polluting the skcipher API this
way?


Re: Security Random Number Generator support

2020-06-02 Thread Ard Biesheuvel
On Tue, 2 Jun 2020 at 10:15, Neal Liu  wrote:
>
> This patch series introduces a security random number generator
> which provides a generic interface to get hardware rnd from Secure
> state. The Secure state can be Arm Trusted Firmware(ATF), Trusted
> Execution Environment(TEE), or even EL2 hypervisor.
>
> Patch #1..2 adds sec-rng kernel driver for Trustzone based SoCs.
> For security awareness SoCs on ARMv8 with TrustZone enabled,
> peripherals like entropy sources is not accessible from normal world
> (linux) and rather accessible from secure world (HYP/ATF/TEE) only.
> This driver aims to provide a generic interface to Arm Trusted
> Firmware or Hypervisor rng service.
>
>
> changes since v1:
> - rename mt67xx-rng to mtk-sec-rng since all MediaTek ARMv8 SoCs can reuse
>   this driver.
>   - refine coding style and unnecessary check.
>
>   changes since v2:
>   - remove unused comments.
>   - remove redundant variable.
>
>   changes since v3:
>   - add dt-bindings for MediaTek rng with TrustZone enabled.
>   - revise HWRNG SMC call fid.
>
>   changes since v4:
>   - move bindings to the arm/firmware directory.
>   - revise driver init flow to check more property.
>
>   changes since v5:
>   - refactor to more generic security rng driver which
> is not platform specific.
>
> *** BLURB HERE ***
>
> Neal Liu (2):
>   dt-bindings: rng: add bindings for sec-rng
>   hwrng: add sec-rng driver
>

There is no reason to model an SMC call as a driver, or to represent it
via a DT node like this.

It would be much better if this SMC interface were made truly generic
and wired into the arch_get_random() interface, which can be used much
earlier.
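
As a rough sketch (not part of the posted series) of what wiring this
into arch_get_random_seed_long() could look like: SEC_RNG_FID is a
made-up SMC function ID, and the calling convention (status in a0,
entropy in a1) is an assumption rather than a published ABI.

#include <linux/arm-smccc.h>
#include <linux/types.h>

#define SEC_RNG_FID        0x8200ff11        /* hypothetical function ID */

static bool smccc_get_random_seed_long(unsigned long *v)
{
        struct arm_smccc_res res;

        arm_smccc_1_1_invoke(SEC_RNG_FID, &res);
        if ((long)res.a0 < 0)                /* firmware reported an error */
                return false;

        *v = res.a1;
        return true;
}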


>  .../devicetree/bindings/rng/sec-rng.yaml  |  53 ++
>  drivers/char/hw_random/Kconfig|  13 ++
>  drivers/char/hw_random/Makefile   |   1 +
>  drivers/char/hw_random/sec-rng.c  | 155 ++
>  4 files changed, 222 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/rng/sec-rng.yaml
>  create mode 100644 drivers/char/hw_random/sec-rng.c
>
> --
> 2.18.0


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-29 Thread Ard Biesheuvel
On Fri, 29 May 2020 at 15:19, Herbert Xu  wrote:
>
> On Fri, May 29, 2020 at 03:10:43PM +0200, Ard Biesheuvel wrote:
> >
> > OK, so the undocumented assumption is that algif_skcipher requests are
> > delineated by ALG_SET_IV commands, and that anything that gets sent to
> > the socket in between should be treated as a single request, right? I
>
> Correct.
>

So what about the final request? At which point do you decide to
return the final chunk of data that you have been holding back in
order to ensure that you can perform the final processing correctly if
it is not being followed by an ALG_SET_IV command?


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-29 Thread Ard Biesheuvel
On Fri, 29 May 2020 at 14:02, Herbert Xu  wrote:
>
> On Fri, May 29, 2020 at 02:00:14PM +0200, Ard Biesheuvel wrote:
> >
> > Even if this is the case, it requires that an skcipher implementation
> > stores an output IV in the buffer that skcipher request's IV field
> > points to. Currently, we only check whether this is the case for CBC
> > implementations, and so it is quite likely that lots of h/w
> > accelerators or arch code don't adhere to this today.
>
> They are and have always been broken because algif_skcipher has
> always relied on this.
>

OK, so the undocumented assumption is that algif_skcipher requests are
delineated by ALG_SET_IV commands, and that anything that gets sent to
the socket in between should be treated as a single request, right? I
think that makes sense, but do note that this deviates from Stephan's
use case, where the ciphertext stealing block swap was applied after
every call into af_alg, with the IV being inherited from one request
to the next. I think that case was invalid to begin with; I just hope
no other use cases exist where this unspecified behavior is being
relied upon.

> > This might be feasible for the generic CTS driver wrapping h/w
> > accelerated CBC. But how is this supposed to work, e.g., for the two
> > existing h/w implementations of cts(cbc(aes)) that currently ignore
> > this?
>
> They'll have to disable chaining.
>
> The way I'm doing this would allow some implementations to allow
> chaining while others of the same algorithm can disable chaining
> and require the whole request to be presented together.
>

Fair enough.


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-29 Thread Ard Biesheuvel
On Fri, 29 May 2020 at 13:51, Herbert Xu  wrote:
>
> On Fri, May 29, 2020 at 10:20:27AM +0200, Ard Biesheuvel wrote:
> >
> > But many implementations do not return an output IV at all. The only
> > mode that requires it (for the selftests to pass) is CBC.
>
> Most modes can be chained, e.g., CBC, PCBC, OFB, CFB and CTR.
> As it stands algif_skcipher requres all algorithms to support
> chaining.
>

Even if this is the case, it requires that an skcipher implementation
stores an output IV in the buffer that the skcipher request's IV field
points to. Currently, we only check whether this is the case for CBC
implementations, and so it is quite likely that lots of h/w
accelerators or arch code don't adhere to this today.


> > For XTS, we would have to carry some metadata around that tells you
> > whether the initial encryption of the IV has occurred or not. In the
>
> You're right, XTS in its current form cannot be chained.  So we
> do need a way to mark that for algif_skcipher.
>
> > CTS case, you need to swap the last two blocks of ciphertext at the
> > very end.
>
> CTS can be easily chained.  You just need to always keep two blocks
> from being processed until you reach the end.
>

This might be feasible for the generic CTS driver wrapping h/w
accelerated CBC. But how is this supposed to work, e.g., for the two
existing h/w implementations of cts(cbc(aes)) that currently ignore
this?


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-29 Thread Ard Biesheuvel
On Fri, 29 May 2020 at 10:05, Herbert Xu  wrote:
>
> On Thu, May 28, 2020 at 10:33:25AM +0200, Ard Biesheuvel wrote:
> >
> > The reason we return output IVs for CBC is because our generic
> > implementation of CTS can wrap any CBC implementation, and relies on
> > this output IV rather than grabbing it from the ciphertext directly
> > (which may be tricky and is best left up to the CBC code)
>
> No that's not the main reason.  The main user of chaining is
> algif_skcipher.
>

But many implementations do not return an output IV at all. The only
mode that requires it (for the selftests to pass) is CBC.

> > So if you are saying that we should never split up algif_skcipher
> > requests into multiple calls into the underlying skcipher, then I
> > agree with you. Even if the generic CTS seems to work more or less as
> > expected by, e.g., the NIST validation tool, this is unspecified
> > behavior, and definitely broken in various other places.
>
> I was merely suggesting that requests to CTS/XTS shouldn't be
> split up.  Doing it for others would be a serious regression.
>

Given that in these cases, doing so will give incorrect results even
if the input size is a whole multiple of the block size, I agree that
adding this restriction will not break anything that is working
consistently at the moment.

But could you elaborate on the serious regression for other cases? Do
you have anything particular in mind?

> However, having looked at this it would seem that the effort
> in marking CTS/XTS is not that different to just adding support
> to hold the last two blocks of data so that CTS/XTS can support
> chaining.
>

For XTS, we would have to carry some metadata around that tells you
whether the initial encryption of the IV has occurred or not. In the
CTS case, you need to swap the last two blocks of ciphertext at the
very end.

So does that mean some kind of init/update/final model for skcipher? I
can see how that could address these issues (init() would encrypt the
IV for XTS, and final() would do the final block handling for CTS).
Just holding two blocks of data does not seem sufficient to me to
handle these issues.
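
Purely as an illustration of that model (none of these hooks exist in
the skcipher API today), the state such a split would have to carry
might look like this, with the two problematic steps pinned to init()
and final():

#include <linux/types.h>

struct chained_req_state {
        u8 tweak[16];        /* XTS: running tweak, produced once by init() */
        u8 tail[32];         /* CTS: final two blocks withheld by update()  */
        unsigned int tail_len;
};

/*
 * init():   encrypt req->iv with the XTS tweak key exactly once.
 * update(): process everything except the last two blocks, stashing
 *           them in ->tail for the next call.
 * final():  flush ->tail, applying the CTS swap / ciphertext stealing.
 */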


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-28 Thread Ard Biesheuvel
On Thu, 28 May 2020 at 09:33, Herbert Xu  wrote:
>
> Ard Biesheuvel  wrote:
> > Stephan reports that the arm64 implementation of cts(cbc(aes)) deviates
> > from the generic implementation in what it returns as the output IV. So
> > fix this, and add some test vectors to catch other non-compliant
> > implementations.
> >
> > Stephan, could you provide a reference for the NIST validation tool and
> > how it flags this behaviour as non-compliant? Thanks.
>
> I think our CTS and XTS are both broken with respect to af_alg.
>
> The reason we use output IVs in general is to support chaining
> which is required by algif_skcipher to break up large requests
> into smaller ones.
>
> For CTS and XTS that simply doesn't work.  So we should fix this
> by changing algif_skcipher to not do chaining (and hence drop
> support for large requests like algif_aead) for algorithms like
> CTS/XTS.
>

The reason we return output IVs for CBC is that our generic
implementation of CTS can wrap any CBC implementation, and relies on
this output IV rather than grabbing it from the ciphertext directly
(which may be tricky and is best left up to the CBC code).

For CTS itself, as well as XTS, returning an output IV is meaningless,
given that
a) the implementations rely on the skcipher_walk infrastructure to
present all input except the last bit in chunks that are a multiple of
the block size,
b) neither specification defines how chaining of inputs should work,
regardless of whether the preceding input was a multiple of the block
size or not.

The CS3 mode that we implement for CTS swaps the final two blocks
unconditionally. So even if the input is a whole multiple of the block
size, the preceding ciphertext will turn out differently if any output
happens to follow.
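
As a purely illustrative user-space sketch of that behaviour (not the
kernel driver): given the last two ciphertext blocks as produced by
plain CBC (with the final plaintext block zero-padded if it is partial),
CS3 emits the last CBC block first, followed by the previous block
truncated to the length of the final plaintext block, and it does so
even when that length is a full block:

#include <stddef.h>
#include <string.h>

#define BS 16

static void cts_cs3_finish(unsigned char *last_two, size_t total_len)
{
        size_t d = total_len % BS ? total_len % BS : BS;
        unsigned char prev[BS];

        memcpy(prev, last_two, BS);           /* save C_{n-1}                      */
        memmove(last_two, last_two + BS, BS); /* C_n becomes the last full block   */
        memcpy(last_two + BS, prev, d);       /* then C_{n-1} truncated to d bytes */
}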

For XTS, the IV is encrypted before processing the first block, so
even if you do return an output IV, the subsequent invocations of the
skcipher need to omit the encryption, which we don't implement
currently.

So if you are saying that we should never split up algif_skcipher
requests into multiple calls into the underlying skcipher, then I
agree with you. Even if the generic CTS seems to work more or less as
expected by, e.g., the NIST validation tool, this is unspecified
behavior, and definitely broken in various other places.


Re: [PATCH 5/5] crypto: stm32/crc: protect from concurrent accesses

2020-05-25 Thread Ard Biesheuvel
On Mon, 25 May 2020 at 13:49, Nicolas TOROMANOFF
 wrote:
>
> > -Original Message-
> > From: Ard Biesheuvel 
> > Sent: Monday, May 25, 2020 11:07 AM
> > To: Nicolas TOROMANOFF ; Eric Biggers
> > 
> > On Mon, 25 May 2020 at 11:01, Nicolas TOROMANOFF
> >  wrote:
> > >
> > > > -Original Message-
> > > > From: Ard Biesheuvel 
> > > > Sent: Monday, May 25, 2020 9:46 AM
> > > > To: Nicolas TOROMANOFF 
> > > > Subject: Re: [PATCH 5/5] crypto: stm32/crc: protect from concurrent
> > > > accesses
> > > >
> > > > On Mon, 25 May 2020 at 09:24, Nicolas TOROMANOFF
> > > >  wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > > -Original Message-
> > > > > > From: Ard Biesheuvel 
> > > > > > Sent: Friday, May 22, 2020 6:12 PM> On Tue, 12 May 2020 at
> > > > > > 16:13, Nicolas Toromanoff  wrote:
> > > > > > >
> > > > > > > Protect STM32 CRC device from concurrent accesses.
> > > > > > >
> > > > > > > As we create a spinlocked section that increase with buffer
> > > > > > > size, we provide a module parameter to release the pressure by
> > > > > > > splitting critical section in chunks.
> > > > > > >
> > > > > > > Size of each chunk is defined in burst_size module parameter.
> > > > > > > By default burst_size=0, i.e. don't split incoming buffer.
> > > > > > >
> > > > > > > Signed-off-by: Nicolas Toromanoff 
> > > > > >
> > > > > > Would you mind explaining the usage model here? It looks like
> > > > > > you are sharing a CRC hardware accelerator with a synchronous
> > > > > > interface between different users by using spinlocks? You are
> > > > > > aware that this will tie up the waiting CPUs completely during
> > > > > > this time, right? So it would be much better to use a mutex
> > > > > > here. Or perhaps it would make more sense to fall back to a s/w
> > > > > > based CRC routine if the h/w is tied up
> > > > working for another task?
> > > > >
> > > > > I know mutex are more acceptable here, but shash _update() and
> > > > > _init() may be call from any context, and so I cannot take a mutex.
> > > > > And to protect my concurrent HW access I only though about spinlock.
> > > > > Due to possible constraint on CPUs, I add a burst_size option to
> > > > > force slitting long buffer into smaller one, and so decrease time we 
> > > > > take
> > the lock.
> > > > > But I didn't though to fallback to software CRC.
> > > > >
> > > > > I'll do a patch on top.
> > > > > In in the burst_update() function I'll use a
> > > > > spin_trylock_irqsave() and use
> > > > software CRC32 if HW is already in use.
> > > > >
> > > >
> > > > Right. I didn't even notice that you were keeping interrupts
> > > > disabled the whole time when using the h/w block. That means that
> > > > any serious use of this h/w block will make IRQ latency go through the 
> > > > roof.
> > > >
> > > > I recommend that you go back to the drawing board on this driver,
> > > > rather than papering over the issues with a spin_trylock(). Perhaps
> > > > it would be better to model it as a ahash (even though the h/w block
> > > > itself is synchronous) and use a kthread to feed in the data.
> > >
> > > I thought when I updated the driver to move to a ahash interface, but
> > > the main usage of crc32 is the ext4 fs, that calls the shash API.
> > > Commit 877b5691f27a ("crypto: shash - remove shash_desc::flags")
> > > removed possibility to sleep in shash callback. (before this commit
> > > and with MAY_SLEEP option set, using a mutex may have been fine).
> > >
> >
> > According to that commit's log, sleeping is never fine for shash(), since 
> > it uses
> > kmap_atomic() when iterating over the scatterlist.
>
> Today, we could avoid using kmap_atomic() in shash_ashash_*() APIs (the
> ones that Walks through the scatterlist) by using the
> crypto_ahash_walk_first() function to initialize the shash_ahash walker
> (note 

Re: [PATCH 5/5] crypto: stm32/crc: protect from concurrent accesses

2020-05-25 Thread Ard Biesheuvel
(+ Eric)

On Mon, 25 May 2020 at 11:01, Nicolas TOROMANOFF
 wrote:
>
> > -Original Message-
> > From: Ard Biesheuvel 
> > Sent: Monday, May 25, 2020 9:46 AM
> > To: Nicolas TOROMANOFF 
> > Subject: Re: [PATCH 5/5] crypto: stm32/crc: protect from concurrent accesses
> >
> > On Mon, 25 May 2020 at 09:24, Nicolas TOROMANOFF
> >  wrote:
> > >
> > > Hello,
> > >
> > > > -Original Message-
> > > > From: Ard Biesheuvel 
> > > > Sent: Friday, May 22, 2020 6:12 PM>
> > > > On Tue, 12 May 2020 at 16:13, Nicolas Toromanoff
> > > >  wrote:
> > > > >
> > > > > Protect STM32 CRC device from concurrent accesses.
> > > > >
> > > > > As we create a spinlocked section that increase with buffer size,
> > > > > we provide a module parameter to release the pressure by splitting
> > > > > critical section in chunks.
> > > > >
> > > > > Size of each chunk is defined in burst_size module parameter.
> > > > > By default burst_size=0, i.e. don't split incoming buffer.
> > > > >
> > > > > Signed-off-by: Nicolas Toromanoff 
> > > >
> > > > Would you mind explaining the usage model here? It looks like you
> > > > are sharing a CRC hardware accelerator with a synchronous interface
> > > > between different users by using spinlocks? You are aware that this
> > > > will tie up the waiting CPUs completely during this time, right? So
> > > > it would be much better to use a mutex here. Or perhaps it would
> > > > make more sense to fall back to a s/w based CRC routine if the h/w is 
> > > > tied up
> > working for another task?
> > >
> > > I know mutex are more acceptable here, but shash _update() and _init()
> > > may be call from any context, and so I cannot take a mutex.
> > > And to protect my concurrent HW access I only though about spinlock.
> > > Due to possible constraint on CPUs, I add a burst_size option to force
> > > slitting long buffer into smaller one, and so decrease time we take the 
> > > lock.
> > > But I didn't though to fallback to software CRC.
> > >
> > > I'll do a patch on top.
> > > In in the burst_update() function I'll use a spin_trylock_irqsave() and 
> > > use
> > software CRC32 if HW is already in use.
> > >
> >
> > Right. I didn't even notice that you were keeping interrupts disabled the 
> > whole
> > time when using the h/w block. That means that any serious use of this h/w
> > block will make IRQ latency go through the roof.
> >
> > I recommend that you go back to the drawing board on this driver, rather 
> > than
> > papering over the issues with a spin_trylock(). Perhaps it would be better 
> > to
> > model it as a ahash (even though the h/w block itself is synchronous) and 
> > use a
> > kthread to feed in the data.
>
> I thought when I updated the driver to move to a ahash interface, but the 
> main usage
> of crc32 is the ext4 fs, that calls the shash API.
> Commit 877b5691f27a ("crypto: shash - remove shash_desc::flags") removed 
> possibility
> to sleep in shash callback. (before this commit and with MAY_SLEEP option 
> set, using
> a mutex may have been fine).
>

According to that commit's log, sleeping is never fine for shash(),
since it uses kmap_atomic() when iterating over the scatterlist.

> By now the solution I see is to use the spin_trylock_irqsave(), fallback to 
> software crc *AND* capping burst_size
> to ensure the locked section stay reasonable.
>
> Does this seems acceptable ?
>

If the reason for disabling interrupts is to avoid deadlocks, wouldn't
the switch to trylock() with a software fallback allow us to keep
interrupts enabled?
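
A rough sketch of that idea, reusing the structures from the patch; the
unconditional use of crc32_le() is an assumption (a real fallback would
have to match the configured polynomial), and the runtime PM calls are
omitted for brevity:

static int burst_update(struct shash_desc *desc, const u8 *d8, size_t length)
{
        struct stm32_crc_desc_ctx *ctx = shash_desc_ctx(desc);
        struct stm32_crc *crc = stm32_crc_get_next_crc();
        unsigned long flags;

        if (!crc || !spin_trylock_irqsave(&crc->lock, flags)) {
                /* h/w unavailable or busy: fall back to the software CRC */
                ctx->partial = crc32_le(ctx->partial, d8, length);
                return 0;
        }

        /* ... program the peripheral and read back ctx->partial ... */

        spin_unlock_irqrestore(&crc->lock, flags);
        return 0;
}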


Re: [PATCH 5/5] crypto: stm32/crc: protect from concurrent accesses

2020-05-25 Thread Ard Biesheuvel
On Mon, 25 May 2020 at 09:24, Nicolas TOROMANOFF
 wrote:
>
> Hello,
>
> > -Original Message-----
> > From: Ard Biesheuvel 
> > Sent: Friday, May 22, 2020 6:12 PM>
> > On Tue, 12 May 2020 at 16:13, Nicolas Toromanoff
> >  wrote:
> > >
> > > Protect STM32 CRC device from concurrent accesses.
> > >
> > > As we create a spinlocked section that increase with buffer size, we
> > > provide a module parameter to release the pressure by splitting
> > > critical section in chunks.
> > >
> > > Size of each chunk is defined in burst_size module parameter.
> > > By default burst_size=0, i.e. don't split incoming buffer.
> > >
> > > Signed-off-by: Nicolas Toromanoff 
> >
> > Would you mind explaining the usage model here? It looks like you are 
> > sharing a
> > CRC hardware accelerator with a synchronous interface between different 
> > users
> > by using spinlocks? You are aware that this will tie up the waiting CPUs
> > completely during this time, right? So it would be much better to use a 
> > mutex
> > here. Or perhaps it would make more sense to fall back to a s/w based CRC
> > routine if the h/w is tied up working for another task?
>
> I know mutex are more acceptable here, but shash _update() and _init() may be 
> call
> from any context, and so I cannot take a mutex.
> And to protect my concurrent HW access I only though about spinlock. Due to 
> possible
> constraint on CPUs, I add a burst_size option to force slitting long buffer 
> into smaller one,
> and so decrease time we take the lock.
> But I didn't though to fallback to software CRC.
>
> I'll do a patch on top.
> In in the burst_update() function I'll use a spin_trylock_irqsave() and use 
> software CRC32 if HW is already in use.
>

Right. I didn't even notice that you were keeping interrupts disabled
the whole time when using the h/w block. That means that any serious
use of this h/w block will make IRQ latency go through the roof.

I recommend that you go back to the drawing board on this driver,
rather than papering over the issues with a spin_trylock(). Perhaps it
would be better to model it as an ahash (even though the h/w block
itself is synchronous) and use a kthread to feed in the data.


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-23 Thread Ard Biesheuvel
On Sat, 23 May 2020 at 20:52, Stephan Müller  wrote:
>
> Am Donnerstag, 21. Mai 2020, 15:23:41 CEST schrieb Ard Biesheuvel:
>
> Hi Ard,
>
> > On Thu, 21 May 2020 at 15:01, Gilad Ben-Yossef  wrote:
> > > Hi Ard,
> > >
> > > Thank you for looping me in.
> > >
> > > On Wed, May 20, 2020 at 10:09 AM Ard Biesheuvel  wrote:
> > > > On Wed, 20 May 2020 at 09:01, Stephan Mueller 
> wrote:
> > > > > Am Mittwoch, 20. Mai 2020, 08:54:10 CEST schrieb Ard Biesheuvel:
> > > > >
> > > > > Hi Ard,
> > > > >
> > > > > > On Wed, 20 May 2020 at 08:47, Stephan Mueller 
> wrote:
> > > > ...
> > > >
> > > > > > > The state of all block chaining modes we currently have is defined
> > > > > > > with
> > > > > > > the
> > > > > > > IV. That is the reason why I mentioned it can be implemented
> > > > > > > stateless
> > > > > > > when I am able to get the IV output from the previous operation.
> > > > > >
> > > > > > But it is simply the same as the penultimate block of ciphertext. So
> > > > > > you can simply capture it after encrypt, or before decrypt. There is
> > > > > > really no need to rely on the CTS transformation to pass it back to
> > > > > > you via the buffer that is only specified to provide an input to the
> > > > > > CTS transform.
> > > > >
> > > > > Let me recheck that as I am not fully sure on that one. But if it can
> > > > > be
> > > > > handled that way, it would make life easier.
> > > >
> > > > Please refer to patch 2. The .iv_out test vectors were all simply
> > > > copied from the appropriate offset into the associated .ctext member.
> > >
> > > Not surprisingly since to the best of my understanding this behaviour
> > > is not strictly specified, ccree currently fails the IV output check
> > > with the 2nd version of the patch.
> >
> > That is what I suspected, hence the cc:
> > > If I understand you correctly, the expected output IV is simply the
> > > next to last block of the ciphertext?
> >
> > Yes. But this happens to work for the generic case because the CTS
> > driver itself requires the encapsulated CBC mode to return the output
> > IV, which is simply passed through back to the caller. CTS mode itself
> > does not specify any kind of output IV, so we should not rely on this
> > behavior.
>
> Note, the update to the spec based on your suggestion is already in a merge
> request:
>
> https://github.com/usnistgov/ACVP/issues/860
>
> Thanks for your input.
>

Thanks for the heads-up. I've left a comment there, as the proposed
change is not equivalent to the unspecified current behavior.


Re: Monte Carlo Test (MCT) for AES

2020-05-22 Thread Ard Biesheuvel
(+ Stephan)

On Fri, 22 May 2020 at 05:20, Bhat, Jayalakshmi Manjunath
 wrote:
>
> Hi All,
>
> We are using libkcapi for CAVS vectors verification on our Linux kernel. Our 
> Linux kernel version is 4.14.  Monte Carlo Test (MCT) for SHA worked fine 
> using libkcapi. We are trying to perform Monte Carlo Test (MCT) for AES using 
> libkcapi.
> We not able to get the result successfully. Is it possible to use libkcapi to 
> achieve AES MCT?
>
> Regards,
> Jayalakshmi
>


Re: [PATCH 5/5] crypto: stm32/crc: protect from concurrent accesses

2020-05-22 Thread Ard Biesheuvel
On Tue, 12 May 2020 at 16:13, Nicolas Toromanoff
 wrote:
>
> Protect STM32 CRC device from concurrent accesses.
>
> As we create a spinlocked section that increase with buffer size,
> we provide a module parameter to release the pressure by splitting
> critical section in chunks.
>
> Size of each chunk is defined in burst_size module parameter.
> By default burst_size=0, i.e. don't split incoming buffer.
>
> Signed-off-by: Nicolas Toromanoff 

Would you mind explaining the usage model here? It looks like you are
sharing a CRC hardware accelerator with a synchronous interface
between different users by using spinlocks? You are aware that this
will tie up the waiting CPUs completely during this time, right? So it
would be much better to use a mutex here. Or perhaps it would make
more sense to fall back to a s/w based CRC routine if the h/w is tied
up working for another task?

Using spinlocks for this is really not acceptable.



> ---
>  drivers/crypto/stm32/stm32-crc32.c | 47 --
>  1 file changed, 45 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/stm32/stm32-crc32.c 
> b/drivers/crypto/stm32/stm32-crc32.c
> index 413415c216ef..3ba41148c2a4 100644
> --- a/drivers/crypto/stm32/stm32-crc32.c
> +++ b/drivers/crypto/stm32/stm32-crc32.c
> @@ -35,11 +35,16 @@
>
>  #define CRC_AUTOSUSPEND_DELAY  50
>
> +static unsigned int burst_size;
> +module_param(burst_size, uint, 0644);
> +MODULE_PARM_DESC(burst_size, "Select burst byte size (0 unlimited)");
> +
>  struct stm32_crc {
> struct list_head list;
> struct device*dev;
> void __iomem *regs;
> struct clk   *clk;
> +   spinlock_t   lock;
>  };
>
>  struct stm32_crc_list {
> @@ -109,6 +114,7 @@ static int stm32_crc_init(struct shash_desc *desc)
> struct stm32_crc_desc_ctx *ctx = shash_desc_ctx(desc);
> struct stm32_crc_ctx *mctx = crypto_shash_ctx(desc->tfm);
> struct stm32_crc *crc;
> +   unsigned long flags;
>
> crc = stm32_crc_get_next_crc();
> if (!crc)
> @@ -116,6 +122,8 @@ static int stm32_crc_init(struct shash_desc *desc)
>
> pm_runtime_get_sync(crc->dev);
>
> +   spin_lock_irqsave(&crc->lock, flags);
> +
> /* Reset, set key, poly and configure in bit reverse mode */
> writel_relaxed(bitrev32(mctx->key), crc->regs + CRC_INIT);
> writel_relaxed(bitrev32(mctx->poly), crc->regs + CRC_POL);
> @@ -125,18 +133,21 @@ static int stm32_crc_init(struct shash_desc *desc)
> /* Store partial result */
> ctx->partial = readl_relaxed(crc->regs + CRC_DR);
>
> +   spin_unlock_irqrestore(&crc->lock, flags);
> +
> pm_runtime_mark_last_busy(crc->dev);
> pm_runtime_put_autosuspend(crc->dev);
>
> return 0;
>  }
>
> -static int stm32_crc_update(struct shash_desc *desc, const u8 *d8,
> -   unsigned int length)
> +static int burst_update(struct shash_desc *desc, const u8 *d8,
> +   size_t length)
>  {
> struct stm32_crc_desc_ctx *ctx = shash_desc_ctx(desc);
> struct stm32_crc_ctx *mctx = crypto_shash_ctx(desc->tfm);
> struct stm32_crc *crc;
> +   unsigned long flags;
>
> crc = stm32_crc_get_next_crc();
> if (!crc)
> @@ -144,6 +155,8 @@ static int stm32_crc_update(struct shash_desc *desc, 
> const u8 *d8,
>
> pm_runtime_get_sync(crc->dev);
>
> +   spin_lock_irqsave(&crc->lock, flags);
> +
> /*
>  * Restore previously calculated CRC for this context as init value
>  * Restore polynomial configuration
> @@ -182,12 +195,40 @@ static int stm32_crc_update(struct shash_desc *desc, 
> const u8 *d8,
> /* Store partial result */
> ctx->partial = readl_relaxed(crc->regs + CRC_DR);
>
> +   spin_unlock_irqrestore(&crc->lock, flags);
> +
> pm_runtime_mark_last_busy(crc->dev);
> pm_runtime_put_autosuspend(crc->dev);
>
> return 0;
>  }
>
> +static int stm32_crc_update(struct shash_desc *desc, const u8 *d8,
> +   unsigned int length)
> +{
> +   const unsigned int burst_sz = burst_size;
> +   unsigned int rem_sz;
> +   const u8 *cur;
> +   size_t size;
> +   int ret;
> +
> +   if (!burst_sz)
> +   return burst_update(desc, d8, length);
> +
> +   /* Digest first bytes not 32bit aligned at first pass in the loop */
> +   size = min(length,
> +  burst_sz + (unsigned int)d8 - ALIGN_DOWN((unsigned int)d8,
> +   sizeof(u32)));
> +   for (rem_sz = length, cur = d8; rem_sz;
> +rem_sz -= size, cur += size, size = min(rem_sz, burst_sz)) {
> +   ret = burst_update(desc, cur, size);
> +   if (ret)
> +   return ret;
> +   }
> +
> +   return 0;
> +}
> +
>  static int stm32_crc_final(struct shash_desc *desc, u8 *out)
>  {
> str

Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-21 Thread Ard Biesheuvel
On Thu, 21 May 2020 at 15:01, Gilad Ben-Yossef  wrote:
>
> Hi Ard,
>
> Thank you for looping me in.
>
> On Wed, May 20, 2020 at 10:09 AM Ard Biesheuvel  wrote:
> >
> > On Wed, 20 May 2020 at 09:01, Stephan Mueller  wrote:
> > >
> > > Am Mittwoch, 20. Mai 2020, 08:54:10 CEST schrieb Ard Biesheuvel:
> > >
> > > Hi Ard,
> > >
> > > > On Wed, 20 May 2020 at 08:47, Stephan Mueller  
> > > > wrote:
> > ...
> > > > > The state of all block chaining modes we currently have is defined 
> > > > > with
> > > > > the
> > > > > IV. That is the reason why I mentioned it can be implemented stateless
> > > > > when I am able to get the IV output from the previous operation.
> > > >
> > > > But it is simply the same as the penultimate block of ciphertext. So
> > > > you can simply capture it after encrypt, or before decrypt. There is
> > > > really no need to rely on the CTS transformation to pass it back to
> > > > you via the buffer that is only specified to provide an input to the
> > > > CTS transform.
> > >
> > > Let me recheck that as I am not fully sure on that one. But if it can be
> > > handled that way, it would make life easier.
> >
> > Please refer to patch 2. The .iv_out test vectors were all simply
> > copied from the appropriate offset into the associated .ctext member.
>
> Not surprisingly since to the best of my understanding this behaviour
> is not strictly specified, ccree currently fails the IV output check
> with the 2nd version of the patch.
>

That is what I suspected, hence the cc:

> If I understand you correctly, the expected output IV is simply the
> next to last block of the ciphertext?

Yes. But this happens to work for the generic case because the CTS
driver itself requires the encapsulated CBC mode to return the output
IV, which is simply passed straight back to the caller. CTS mode itself
does not specify any kind of output IV, so we should not rely on this
behavior.


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-20 Thread Ard Biesheuvel
On Wed, 20 May 2020 at 09:01, Stephan Mueller  wrote:
>
> Am Mittwoch, 20. Mai 2020, 08:54:10 CEST schrieb Ard Biesheuvel:
>
> Hi Ard,
>
> > On Wed, 20 May 2020 at 08:47, Stephan Mueller  wrote:
...
> > > The state of all block chaining modes we currently have is defined with
> > > the
> > > IV. That is the reason why I mentioned it can be implemented stateless
> > > when I am able to get the IV output from the previous operation.
> >
> > But it is simply the same as the penultimate block of ciphertext. So
> > you can simply capture it after encrypt, or before decrypt. There is
> > really no need to rely on the CTS transformation to pass it back to
> > you via the buffer that is only specified to provide an input to the
> > CTS transform.
>
> Let me recheck that as I am not fully sure on that one. But if it can be
> handled that way, it would make life easier.

Please refer to patch 2. The .iv_out test vectors were all simply
copied from the appropriate offset into the associated .ctext member.


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-19 Thread Ard Biesheuvel
On Wed, 20 May 2020 at 08:47, Stephan Mueller  wrote:
>
> Am Mittwoch, 20. Mai 2020, 08:40:57 CEST schrieb Ard Biesheuvel:
>
> Hi Ard,
>
> > On Wed, 20 May 2020 at 08:03, Stephan Mueller  wrote:
> > > Am Dienstag, 19. Mai 2020, 21:02:09 CEST schrieb Ard Biesheuvel:
> > >
> > > Hi Ard,
> > >
> > > > Stephan reports that the arm64 implementation of cts(cbc(aes)) deviates
> > > > from the generic implementation in what it returns as the output IV. So
> > > > fix this, and add some test vectors to catch other non-compliant
> > > > implementations.
> > > >
> > > > Stephan, could you provide a reference for the NIST validation tool and
> > > > how it flags this behaviour as non-compliant? Thanks.
> > >
> > > The test definition that identified the inconsistent behavior is specified
> > > with [1]. Note, this testing is intended to become an RFC standard.
> >
> > Are you referring to the line
> >
> > CT[j] = AES_CBC_CS_ENCRYPT(Key[i], PT[j])
> >
> > where the CTS transform is invoked without an IV altogether?
>
> Precisely.
>
> > That
> > simply seems like a bug to me. In an abstract specification like this,
> > it would be insane for pseudocode functions to be stateful objects,
> > and there is nothing in the pseudocode that explicitly captures the
> > 'output IV' of that function call.
>
> I think the description may be updated by simply refer to IV[j-1]. Then you
> would not have a stateful operation, but you rest on the IV of the previous
> operation.
>

But that value is not the value you are using now, right? I suspect
that the line

IV[i+1] = MSB(CT[j], IV.length)

needs to be duplicated in the inner loop over j, although that would
require different versions for CS1/2/3.


> The state of all block chaining modes we currently have is defined with the
> IV. That is the reason why I mentioned it can be implemented stateless when I
> am able to get the IV output from the previous operation.
>

But it is simply the same as the penultimate block of ciphertext. So
you can simply capture it after encrypt, or before decrypt. There is
really no need to rely on the CTS transformation to pass it back to
you via the buffer that is only specified to provide an input to the
CTS transform.


> >
> > > To facilitate that testing, NIST offers an internet service, the ACVP
> > > server, that allows obtaining test vectors and uploading responses. You
> > > see the large number of concluded testing with [2]. A particular
> > > completion of the CTS testing I finished yesterday is given in [3]. That
> > > particular testing was also performed on an ARM system with CE where the
> > > issue was detected.
> > >
> > > I am performing the testing with [4] that has an extension to test the
> > > kernel crypto API.
> >
> > OK. So given that neither the CTS spec nor this document makes
> > any mention of an output IV or what its value should be, my suggestion
> > would be to capture the IV directly from the ciphertext, rather than
> > relying on some unspecified behavior to give you the right data. Note
> > that we have other implementations of cts(cbc(aes)) in the kernel as
> > well (h/w accelerated ones) and if there is no specification that
> > defines this behavior, you really shouldn't be relying on it.
>
> Agreed, but all I need is the IV from the previous round without relying on
> any state.

So just grab it from the ciphertext of the previous round.
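
Concretely, for the CBC-CS3 layout used here, the value that the
generic cts(cbc(aes)) driver happens to return as its output IV is the
last full ciphertext block, i.e. the block sitting just before the
(possibly truncated) final block; a caller chaining operations itself
can simply lift it from there (illustrative sketch, not kernel code):

#include <stddef.h>
#include <string.h>

static void next_cts_iv(const unsigned char *ct, size_t ct_len,
                        unsigned char iv[16])
{
        size_t tail = ct_len % 16 ? ct_len % 16 : 16;

        memcpy(iv, ct + ct_len - tail - 16, 16);
}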

> >
> >
> > That 'specification' invokes AES_CBC_CS_ENCRYPT() twice using a
> > different prototype, without any mention whatsoever what the implied
> > value of IV[] is if it is missing. This is especially problematic
> > given that it seems to cover all of CS1/2/3, and the relation between
> > next IV and ciphertext is not even the same between those for inputs
> > that are a multiple of the blocksize.
>
> I will relay that comment back to the authors for update.

Thanks.


> >
> > > [1]
> > > https://github.com/usnistgov/ACVP/blob/master/artifacts/draft-celi-acvp-b
> > > lock-ciph-00.txt#L366
> > >
> > > [2]
> > > https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program
> > > / validation-search?searchMode=validation&family=1&productType=-1&ipp=25
> > >
> > > [3]
> > > https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program
> > > / details?validation=32608
> > >
> > > [4] https://github.com/smuellerDD/acvpparser
> > >
> > > > Cc: Stephan Mueller 
> > > >
> > > > Ard Biesheuvel (2):
> > > >   crypto: arm64/aes - align output IV with generic CBC-CTS driver
> > > >   crypto: testmgr - add output IVs for AES-CBC with ciphertext stealing
> > > >
> > > >  arch/arm64/crypto/aes-modes.S |  2 ++
> > > >  crypto/testmgr.h  | 12 
> > > >  2 files changed, 14 insertions(+)
> > >
> > > Ciao
> > > Stephan
>
>
> Ciao
> Stephan
>
>


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-19 Thread Ard Biesheuvel
On Wed, 20 May 2020 at 08:03, Stephan Mueller  wrote:
>
> Am Dienstag, 19. Mai 2020, 21:02:09 CEST schrieb Ard Biesheuvel:
>
> Hi Ard,
>
> > Stephan reports that the arm64 implementation of cts(cbc(aes)) deviates
> > from the generic implementation in what it returns as the output IV. So
> > fix this, and add some test vectors to catch other non-compliant
> > implementations.
> >
> > Stephan, could you provide a reference for the NIST validation tool and
> > how it flags this behaviour as non-compliant? Thanks.
>
> The test definition that identified the inconsistent behavior is specified
> with [1]. Note, this testing is intended to become an RFC standard.
>

Are you referring to the line

CT[j] = AES_CBC_CS_ENCRYPT(Key[i], PT[j])

where the CTS transform is invoked without an IV altogether? That
simply seems like a bug to me. In an abstract specification like this,
it would be insane for pseudocode functions to be stateful objects,
and there is nothing in the pseudocode that explicitly captures the
'output IV' of that function call.


> To facilitate that testing, NIST offers an internet service, the ACVP server,
> that allows obtaining test vectors and uploading responses. You see the large
> number of concluded testing with [2]. A particular completion of the CTS
> testing I finished yesterday is given in [3]. That particular testing was also
> performed on an ARM system with CE where the issue was detected.
>
> I am performing the testing with [4] that has an extension to test the kernel
> crypto API.
>

OK. So given that neither the CTS spec nor this document makes
any mention of an output IV or what its value should be, my suggestion
would be to capture the IV directly from the ciphertext, rather than
relying on some unspecified behavior to give you the right data. Note
that we have other implementations of cts(cbc(aes)) in the kernel as
well (h/w accelerated ones) and if there is no specification that
defines this behavior, you really shouldn't be relying on it.


That 'specification' invokes AES_CBC_CS_ENCRYPT() twice, with two
different prototypes, without any mention whatsoever of what the implied
value of IV[] is if it is missing. This is especially problematic
given that it seems to cover all of CS1/2/3, and the relation between
next IV and ciphertext is not even the same between those for inputs
that are a multiple of the blocksize.


> [1] 
> https://github.com/usnistgov/ACVP/blob/master/artifacts/draft-celi-acvp-block-ciph-00.txt#L366
>
> [2] https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/
> validation-search?searchMode=validation&family=1&productType=-1&ipp=25
>
> [3] https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/
> details?validation=32608
>
> [4] https://github.com/smuellerDD/acvpparser
> >
> > Cc: Stephan Mueller 
> >
> > Ard Biesheuvel (2):
> >   crypto: arm64/aes - align output IV with generic CBC-CTS driver
> >   crypto: testmgr - add output IVs for AES-CBC with ciphertext stealing
> >
> >  arch/arm64/crypto/aes-modes.S |  2 ++
> >  crypto/testmgr.h  | 12 
> >  2 files changed, 14 insertions(+)
>
>
> Ciao
> Stephan
>
>


Re: [RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-19 Thread Ard Biesheuvel
(add Gilad for cc-ree)

On Tue, 19 May 2020 at 21:02, Ard Biesheuvel  wrote:
>
> Stephan reports that the arm64 implementation of cts(cbc(aes)) deviates
> from the generic implementation in what it returns as the output IV. So
> fix this, and add some test vectors to catch other non-compliant
> implementations.
>
> Stephan, could you provide a reference for the NIST validation tool and
> how it flags this behaviour as non-compliant? Thanks.
>
> Cc: Stephan Mueller 
>
> Ard Biesheuvel (2):
>   crypto: arm64/aes - align output IV with generic CBC-CTS driver
>   crypto: testmgr - add output IVs for AES-CBC with ciphertext stealing
>
>  arch/arm64/crypto/aes-modes.S |  2 ++
>  crypto/testmgr.h  | 12 
>  2 files changed, 14 insertions(+)
>
> --
> 2.20.1
>


[RFC/RFT PATCH 2/2] crypto: testmgr - add output IVs for AES-CBC with ciphertext stealing

2020-05-19 Thread Ard Biesheuvel
Add some test vectors to get coverage for the IV that is output by CTS
implementations.

Signed-off-by: Ard Biesheuvel 
---
 crypto/testmgr.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index d29983908c38..d45fa1ad91ee 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -31041,6 +31041,8 @@ static const struct cipher_testvec 
cts_mode_tv_template[] = {
.ctext  = "\xc6\x35\x35\x68\xf2\xbf\x8c\xb4"
  "\xd8\xa5\x80\x36\x2d\xa7\xff\x7f"
  "\x97",
+   .iv_out = "\xc6\x35\x35\x68\xf2\xbf\x8c\xb4"
+ "\xd8\xa5\x80\x36\x2d\xa7\xff\x7f",
}, {
.klen   = 16,
.key= "\x63\x68\x69\x63\x6b\x65\x6e\x20"
@@ -31054,6 +31056,8 @@ static const struct cipher_testvec 
cts_mode_tv_template[] = {
  "\xd4\x45\xd4\xc8\xef\xf7\xed\x22"
  "\x97\x68\x72\x68\xd6\xec\xcc\xc0"
  "\xc0\x7b\x25\xe2\x5e\xcf\xe5",
+   .iv_out = "\xfc\x00\x78\x3e\x0e\xfd\xb2\xc1"
+ "\xd4\x45\xd4\xc8\xef\xf7\xed\x22",
}, {
.klen   = 16,
.key= "\x63\x68\x69\x63\x6b\x65\x6e\x20"
@@ -31067,6 +31071,8 @@ static const struct cipher_testvec 
cts_mode_tv_template[] = {
  "\xbe\x7f\xcb\xcc\x98\xeb\xf5\xa8"
  "\x97\x68\x72\x68\xd6\xec\xcc\xc0"
  "\xc0\x7b\x25\xe2\x5e\xcf\xe5\x84",
+   .iv_out = "\x39\x31\x25\x23\xa7\x86\x62\xd5"
+ "\xbe\x7f\xcb\xcc\x98\xeb\xf5\xa8",
}, {
.klen   = 16,
.key= "\x63\x68\x69\x63\x6b\x65\x6e\x20"
@@ -31084,6 +31090,8 @@ static const struct cipher_testvec 
cts_mode_tv_template[] = {
  "\x1b\x55\x49\xd2\xf8\x38\x02\x9e"
  "\x39\x31\x25\x23\xa7\x86\x62\xd5"
  "\xbe\x7f\xcb\xcc\x98\xeb\xf5",
+   .iv_out = "\xb3\xff\xfd\x94\x0c\x16\xa1\x8c"
+ "\x1b\x55\x49\xd2\xf8\x38\x02\x9e",
}, {
.klen   = 16,
.key= "\x63\x68\x69\x63\x6b\x65\x6e\x20"
@@ -31101,6 +31109,8 @@ static const struct cipher_testvec 
cts_mode_tv_template[] = {
  "\x3b\xc1\x03\xe1\xa1\x94\xbb\xd8"
  "\x39\x31\x25\x23\xa7\x86\x62\xd5"
  "\xbe\x7f\xcb\xcc\x98\xeb\xf5\xa8",
+   .iv_out = "\x9d\xad\x8b\xbb\x96\xc4\xcd\xc0"
+ "\x3b\xc1\x03\xe1\xa1\x94\xbb\xd8",
}, {
.klen   = 16,
.key= "\x63\x68\x69\x63\x6b\x65\x6e\x20"
@@ -31122,6 +31132,8 @@ static const struct cipher_testvec 
cts_mode_tv_template[] = {
  "\x26\x73\x0d\xbc\x2f\x7b\xc8\x40"
  "\x9d\xad\x8b\xbb\x96\xc4\xcd\xc0"
  "\x3b\xc1\x03\xe1\xa1\x94\xbb\xd8",
+   .iv_out = "\x48\x07\xef\xe8\x36\xee\x89\xa5"
+ "\x26\x73\x0d\xbc\x2f\x7b\xc8\x40",
}
 };
 
-- 
2.20.1



[RFC/RFT PATCH 0/2] crypto: add CTS output IVs for arm64 and testmgr

2020-05-19 Thread Ard Biesheuvel
Stephan reports that the arm64 implementation of cts(cbc(aes)) deviates
from the generic implementation in what it returns as the output IV. So
fix this, and add some test vectors to catch other non-compliant
implementations.

Stephan, could you provide a reference for the NIST validation tool and
how it flags this behaviour as non-compliant? Thanks.

Cc: Stephan Mueller 

Ard Biesheuvel (2):
  crypto: arm64/aes - align output IV with generic CBC-CTS driver
  crypto: testmgr - add output IVs for AES-CBC with ciphertext stealing

 arch/arm64/crypto/aes-modes.S |  2 ++
 crypto/testmgr.h  | 12 
 2 files changed, 14 insertions(+)

-- 
2.20.1



[RFC/RFT PATCH 1/2] crypto: arm64/aes - align output IV with generic CBC-CTS driver

2020-05-19 Thread Ard Biesheuvel
The generic CTS chaining mode wraps the CBC mode driver in a way that
results in the IV buffer referenced by the skcipher request being
updated with the last block of ciphertext. The arm64 implementation
deviates from this, given that CTS itself does not specify the concept
of an output IV, or how it should be generated, and so it was assumed
that the output IV does not matter.

However, Stephan reports that code exists that relies on this behavior,
and that there is even a NIST validation tool that flags it as
non-compliant [citation needed. Stephan?]

So let's align with the generic implementation here, and return the
penultimate block of ciphertext as the output IV.

Signed-off-by: Ard Biesheuvel 
---
 arch/arm64/crypto/aes-modes.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index cf618d8f6cec..80832464df50 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -275,6 +275,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
add x4, x0, x4
st1 {v0.16b}, [x4]  /* overlapping stores */
st1 {v1.16b}, [x0]
+   st1 {v1.16b}, [x5]
ret
 AES_FUNC_END(aes_cbc_cts_encrypt)
 
@@ -291,6 +292,7 @@ AES_FUNC_START(aes_cbc_cts_decrypt)
ld1 {v1.16b}, [x1]
 
ld1 {v5.16b}, [x5]  /* get iv */
+   st1 {v0.16b}, [x5]
dec_prepare w3, x2, x6
 
decrypt_block   v0, w3, x2, x6, w7
-- 
2.20.1



Re: ARM CE: CTS IV handling

2020-05-19 Thread Ard Biesheuvel
On Tue, 19 May 2020 at 19:50, Ard Biesheuvel  wrote:
>
> On Tue, 19 May 2020 at 19:35, Stephan Mueller  wrote:
> >
> > Am Dienstag, 19. Mai 2020, 18:21:01 CEST schrieb Ard Biesheuvel:
> >
> > Hi Ard,
> >
> > >
> > > To be honest, this looks like the API is being used incorrectly. Is
> > > this a similar issue to the one Herbert spotted recently with the CTR
> > > code?
> > >
> > > When you say 'leaving the TFM untouched' do you mean the skcipher
> > > request? The TFM should not retain any per-request state in the first
> > > place.
> > >
> > > The skcipher request struct is not meant to retain any state either -
> > > the API simply does not support incremental encryption if the input is
> > > not a multiple of the chunksize.
> > >
> > > Could you give some sample code on how you are using the API in this case?
> >
> > What I am doing technically is to allocate a new tfm and request at the
> > beginning and then reuse the TFM and request. In that sense, I think I 
> > violate
> > that constraint.
> >
> > But in order to implement such repetition, I can surely clear / allocate a 
> > new
> > TFM. But in order to get that right, I need the resulting IV after the 
> > cipher
> > operation.
> >
> > This IV that I get after the cipher operation completes is different 
> > between C
> > and CE.
> >
>
> So is the expected output IV simply the last block of ciphertext that
> was generated (as usual), but located before the truncated block in
> the output?

If so, does the below fix the encrypt case?

index cf618d8f6cec..22f190a44689 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -275,6 +275,7 @@ AES_FUNC_START(aes_cbc_cts_encrypt)
add x4, x0, x4
st1 {v0.16b}, [x4]  /* overlapping stores */
st1 {v1.16b}, [x0]
+   st1 {v1.16b}, [x5]
ret
 AES_FUNC_END(aes_cbc_cts_encrypt)


Re: ARM CE: CTS IV handling

2020-05-19 Thread Ard Biesheuvel
On Tue, 19 May 2020 at 19:35, Stephan Mueller  wrote:
>
> Am Dienstag, 19. Mai 2020, 18:21:01 CEST schrieb Ard Biesheuvel:
>
> Hi Ard,
>
> >
> > To be honest, this looks like the API is being used incorrectly. Is
> > this a similar issue to the one Herbert spotted recently with the CTR
> > code?
> >
> > When you say 'leaving the TFM untouched' do you mean the skcipher
> > request? The TFM should not retain any per-request state in the first
> > place.
> >
> > The skcipher request struct is not meant to retain any state either -
> > the API simply does not support incremental encryption if the input is
> > not a multiple of the chunksize.
> >
> > Could you give some sample code on how you are using the API in this case?
>
> What I am doing technically is to allocate a new tfm and request at the
> beginning and then reuse the TFM and request. In that sense, I think I violate
> that constraint.
>
> But in order to implement such repetition, I can surely clear / allocate a new
> TFM. But in order to get that right, I need the resulting IV after the cipher
> operation.
>
> This IV that I get after the cipher operation completes is different between C
> and CE.
>

So is the expected output IV simply the last block of ciphertext that
was generated (as usual), but located before the truncated block in
the output?


Re: ARM CE: CTS IV handling

2020-05-19 Thread Ard Biesheuvel
(+ Eric)

Hi Stephan,

On Tue, 19 May 2020 at 17:31, Stephan Mueller  wrote:
>
> Hi Ard,
>
> The following report applies to kernel 5.3 as I am currently unable to test
> the latest upstream version.
>
> The CTS IV handling for cts-cbc-aes-ce and cts-cbc-aes-neon is not consistent
> with the C implementation for CTS such as cts(cbc-aes-ce) and cts(cbc-aes-
> neon).
>
> For example, assume encryption operation with the following data:
>
> iv "6CDD928D19C56A2255D1EC16CAA2CCCB"
> pt
> "2D6BFE335F45EED1C3C404CAA5CA4D41FF2B8C6DE94C706B10F1D207972DE6599C92E117E3CBF61F"
> key "930E9D4E65DB121E05F11A16E408AE82"
>
> When you perform one encryption operation, all 4 ciphes return:
>
> 022edfa38975b09b295e1958efde2104be1e8e70c81340adfbdf431d5c80e77b89df5997aa96af72
>
> Now, when you leave the TFM untouched (i.e. retain the IV state) and simply
> set the following new pt:
>
> 6cdd928d19c56a2255d1ec16caa2cccb022edfa38975b09b295e1958efde2104be1e8e70c81340ad
>
> the C CTS implementations return
>
> 35d54eb425afe7438c5e96b61b061f04df85a322942210568c20a5e78856c79c0af021f3e0650863
>
> But the cts-cbc-aes-ce and cts-cbc-aes-neon return
>
> a62f57efbe9d815aaf1b6c62f78a31da8ef46e5d401eaf48c261bcf889e6910abbc65c2bf26add9f
>
>
> My hunch is that the internal IV handling is different. I am aware that CTS
> does not exactly specify how the IV should look like after the encryption
> operation, but using the NIST reference implementation of ACVP, the C CTS
> implementation is considered to be OK whereas the ARM CE assembler
> implementation is considered to be not OK.
>
> Bottom line, feeding plaintext in chunks into the ARM CE assembler
> implementation will yield a different output than the C implementation.
>

To be honest, this looks like the API is being used incorrectly. Is
this a similar issue to the one Herbert spotted recently with the CTR
code?

When you say 'leaving the TFM untouched' do you mean the skcipher
request? The TFM should not retain any per-request state in the first
place.

The skcipher request struct is not meant to retain any state either -
the API simply does not support incremental encryption if the input is
not a multiple of the chunksize.

Could you give some sample code on how you are using the API in this case?


Re: linux-next: manual merge of the sound-asoc tree with the crypto tree

2020-05-14 Thread Ard Biesheuvel
On Tue, 12 May 2020 at 22:31, Arnd Bergmann  wrote:
>
> On Tue, May 12, 2020 at 10:08 PM Eric Biggers  wrote:
> > On Tue, May 12, 2020 at 06:08:01PM +0100, Mark Brown wrote:
> >
> > For later: if SHASH_DESC_ON_STACK is causing problems, we really ought to 
> > find a
> > better solution, since lots of users are using this macro.  A version of
> > crypto_shash_tfm_digest() that falls back to heap allocation if the 
> > descsize is
> > too large would be possible, but that wouldn't fully solve the problem since
> > some users do incremental hashing.
>
> It's hard to know how many of the users of SHASH_DESC_ON_STACK() are
> likely to cause problems, as multiple factors are involved:
>
> - this one triggered the warning because it was on the stack of a function
>   that got inlined into another that has other large variables. Whether it
>   got inlined makes little difference to the stack usage, but does make a
>   difference to warning about it.
>
> - generally the structure is larger than we like it, especially on 
> architectures
>   with 128 byte CRYPTO_MINALIGN like ARM. This actually got worse
>   because of b68a7ec1e9a3 ("crypto: hash - Remove VLA usage"), as
>   the stack usage is now always the maximum of all hashes where it used
>   to be specific to the hash that was actually used and could be smaller
>
> - the specific instance in calculate_sha256() feels a bit silly, as this
>   function allocates a tfm and a descriptor, runs the digest and then
>   frees both again. I don't know how common this pattern is, but
>   it seems a higher-level abstraction might be helpful anyway.
>

We are trying to move to crypto library interfaces for
non-performance-critical uses of hashes where the algorithm is known at
compile time, and this is a good example of that pattern.

IOW, this code should just call the sha256_init/update/final routines directly.

I'll send out a patch.
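
For reference, the library-call pattern this refers to looks roughly
like the following (the wrapper name is made up; incremental hashing
works the same way by splitting the update step):

#include <crypto/sha.h>

static void calculate_sha256_digest(const u8 *data, unsigned int len,
                                    u8 digest[SHA256_DIGEST_SIZE])
{
        struct sha256_state state;

        sha256_init(&state);
        sha256_update(&state, data, len);
        sha256_final(&state, digest);
}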


Re: [PATCH 0/7] sha1 library cleanup

2020-05-03 Thread Ard Biesheuvel
On Sat, 2 May 2020 at 20:28, Eric Biggers  wrote:
>
> <linux/cryptohash.h> sounds very generic and important, like it's the
> header to include if you're doing cryptographic hashing in the kernel.
> But actually it only includes the library implementation of the SHA-1
> compression function (not even the full SHA-1).  This should basically
> never be used anymore; SHA-1 is no longer considered secure, and there
> are much better ways to do cryptographic hashing in the kernel.
>
> Also the function is named just "sha_transform()", which makes it
> unclear which version of SHA is meant.
>
> Therefore, this series cleans things up by moving these SHA-1
> declarations into <crypto/sha.h> where they better belong, and changing
> the names to say SHA-1 rather than just SHA.
>
> As future work, we should split sha.h into sha1.h and sha2.h and try to
> remove the remaining uses of SHA-1.  For example, the remaining use in
> drivers/char/random.c is probably one that can be gotten rid of.
>
> This patch series applies to cryptodev/master.
>
> Eric Biggers (7):
>   mptcp: use SHA256_BLOCK_SIZE, not SHA_MESSAGE_BYTES
>   crypto: powerpc/sha1 - remove unused temporary workspace
>   crypto: powerpc/sha1 - prefix the "sha1_" functions
>   crypto: s390/sha1 - prefix the "sha1_" functions
>   crypto: lib/sha1 - rename "sha" to "sha1"
>   crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h
>   crypto: lib/sha1 - fold linux/cryptohash.h into crypto/sha.h
>

For the series,

Acked-by: Ard Biesheuvel 

>  Documentation/security/siphash.rst  |  2 +-
>  arch/arm/crypto/sha1_glue.c |  1 -
>  arch/arm/crypto/sha1_neon_glue.c|  1 -
>  arch/arm/crypto/sha256_glue.c   |  1 -
>  arch/arm/crypto/sha256_neon_glue.c  |  1 -
>  arch/arm/kernel/armksyms.c  |  1 -
>  arch/arm64/crypto/sha256-glue.c |  1 -
>  arch/arm64/crypto/sha512-glue.c |  1 -
>  arch/microblaze/kernel/microblaze_ksyms.c   |  1 -
>  arch/mips/cavium-octeon/crypto/octeon-md5.c |  1 -
>  arch/powerpc/crypto/md5-glue.c  |  1 -
>  arch/powerpc/crypto/sha1-spe-glue.c |  1 -
>  arch/powerpc/crypto/sha1.c  | 33 ++---
>  arch/powerpc/crypto/sha256-spe-glue.c   |  1 -
>  arch/s390/crypto/sha1_s390.c| 12 
>  arch/sparc/crypto/md5_glue.c|  1 -
>  arch/sparc/crypto/sha1_glue.c   |  1 -
>  arch/sparc/crypto/sha256_glue.c |  1 -
>  arch/sparc/crypto/sha512_glue.c |  1 -
>  arch/unicore32/kernel/ksyms.c   |  1 -
>  arch/x86/crypto/sha1_ssse3_glue.c   |  1 -
>  arch/x86/crypto/sha256_ssse3_glue.c |  1 -
>  arch/x86/crypto/sha512_ssse3_glue.c |  1 -
>  crypto/sha1_generic.c   |  5 ++--
>  drivers/char/random.c   |  8 ++---
>  drivers/crypto/atmel-sha.c  |  1 -
>  drivers/crypto/chelsio/chcr_algo.c  |  1 -
>  drivers/crypto/chelsio/chcr_ipsec.c |  1 -
>  drivers/crypto/omap-sham.c  |  1 -
>  fs/f2fs/hash.c  |  1 -
>  include/crypto/sha.h| 10 +++
>  include/linux/cryptohash.h  | 14 -
>  include/linux/filter.h  |  4 +--
>  include/net/tcp.h   |  1 -
>  kernel/bpf/core.c   | 18 +--
>  lib/crypto/chacha.c |  1 -
>  lib/sha1.c  | 24 ---
>  net/core/secure_seq.c   |  1 -
>  net/ipv6/addrconf.c | 10 +++
>  net/ipv6/seg6_hmac.c|  1 -
>  net/mptcp/crypto.c  |  4 +--
>  41 files changed, 69 insertions(+), 104 deletions(-)
>  delete mode 100644 include/linux/cryptohash.h
>
>
> base-commit: 12b3cf9093542d9f752a4968815ece836159013f
> --
> 2.26.2
>


Re: [PATCH 00/20] crypto: introduce crypto_shash_tfm_digest()

2020-05-03 Thread Ard Biesheuvel
On Sat, 2 May 2020 at 07:33, Eric Biggers  wrote:
>
> This series introduces a helper function crypto_shash_tfm_digest() which
> replaces the following common pattern:
>
> {
> SHASH_DESC_ON_STACK(desc, tfm);
> int err;
>
> desc->tfm = tfm;
>
> err = crypto_shash_digest(desc, data, len, out);
>
> shash_desc_zero(desc);
> }
>
> with:
>
> err = crypto_shash_tfm_digest(tfm, data, len, out);
>
> Patch 1 introduces this helper function, and patches 2-20 convert all
> relevant users to use it.
>
> IMO, it would be easiest to take all these patches through the crypto
> tree.  But taking just the "crypto:" ones and then me trying to get the
> rest merged later via subsystem trees is also an option.
>
> Eric Biggers (20):
>   crypto: hash - introduce crypto_shash_tfm_digest()
>   crypto: arm64/aes-glue - use crypto_shash_tfm_digest()
>   crypto: essiv - use crypto_shash_tfm_digest()
>   crypto: artpec6 - use crypto_shash_tfm_digest()
>   crypto: ccp - use crypto_shash_tfm_digest()
>   crypto: ccree - use crypto_shash_tfm_digest()
>   crypto: hisilicon/sec2 - use crypto_shash_tfm_digest()
>   crypto: mediatek - use crypto_shash_tfm_digest()
>   crypto: n2 - use crypto_shash_tfm_digest()
>   crypto: omap-sham - use crypto_shash_tfm_digest()
>   crypto: s5p-sss - use crypto_shash_tfm_digest()
>   nfc: s3fwrn5: use crypto_shash_tfm_digest()
>   fscrypt: use crypto_shash_tfm_digest()
>   ecryptfs: use crypto_shash_tfm_digest()
>   nfsd: use crypto_shash_tfm_digest()
>   ubifs: use crypto_shash_tfm_digest()
>   Bluetooth: use crypto_shash_tfm_digest()
>   sctp: use crypto_shash_tfm_digest()
>   KEYS: encrypted: use crypto_shash_tfm_digest()
>   ASoC: cros_ec_codec: use crypto_shash_tfm_digest()
>

For the series,

Acked-by: Ard Biesheuvel 


>  arch/arm64/crypto/aes-glue.c   |  4 +--
>  crypto/essiv.c |  4 +--
>  crypto/shash.c | 16 +
>  drivers/crypto/axis/artpec6_crypto.c   | 10 ++
>  drivers/crypto/ccp/ccp-crypto-sha.c|  9 ++---
>  drivers/crypto/ccree/cc_cipher.c   |  9 ++---
>  drivers/crypto/hisilicon/sec2/sec_crypto.c |  5 ++-
>  drivers/crypto/mediatek/mtk-sha.c  |  7 ++--
>  drivers/crypto/n2_core.c   |  7 ++--
>  drivers/crypto/omap-sham.c | 20 +++
>  drivers/crypto/s5p-sss.c   | 39 --
>  drivers/nfc/s3fwrn5/firmware.c | 10 +-
>  fs/crypto/fname.c  |  7 +---
>  fs/crypto/hkdf.c   |  6 +---
>  fs/ecryptfs/crypto.c   | 17 +-
>  fs/nfsd/nfs4recover.c  | 26 ---
>  fs/ubifs/auth.c| 20 ++-
>  fs/ubifs/master.c  |  9 ++---
>  fs/ubifs/replay.c  | 14 ++--
>  include/crypto/hash.h  | 19 +++
>  net/bluetooth/smp.c|  6 +---
>  net/sctp/auth.c| 10 ++
>  net/sctp/sm_make_chunk.c   | 23 +
>  security/keys/encrypted-keys/encrypted.c   | 18 ++
>  sound/soc/codecs/cros_ec_codec.c   |  9 +
>  25 files changed, 95 insertions(+), 229 deletions(-)
>
> --
> 2.26.2
>


Re: [PATCH] crypto: lib/aes - move IRQ disabling into AES library

2020-05-02 Thread Ard Biesheuvel
On Sat, 2 May 2020 at 20:44, Eric Biggers  wrote:
>
> From: Eric Biggers 
>
> The AES library code (which originally came from crypto/aes_ti.c) is
> supposed to be constant-time, to the extent possible for a C
> implementation.  But the hardening measure of disabling interrupts while
> the S-box is loaded into cache was not included in the library version;
> it was left only in the crypto API wrapper in crypto/aes_ti.c.
>
> Move this logic into the library version so that everyone gets it.
>

I don't think we should fiddle with interrupts in a general purpose
crypto library.

We /could/ add a variant aes_encrypt_irq_off() if you really want, but
this is not something you should get without asking explicitly imo.
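For illustration, such an opt-in variant could simply wrap the existing
library call; a sketch only, since aes_encrypt_irq_off() is just the name
floated above and not an existing kernel export:

#include <crypto/aes.h>
#include <linux/irqflags.h>

static void aes_encrypt_irq_off(const struct crypto_aes_ctx *ctx,
                                u8 *out, const u8 *in)
{
        unsigned long flags;

        /* Keep the S-box cachelines resident while this block is processed. */
        local_irq_save(flags);
        aes_encrypt(ctx, out, in);
        local_irq_restore(flags);
}

Callers that explicitly want the cacheline-eviction hardening would use the
wrapper; everyone else keeps the plain aes_encrypt().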



> Fixes: e59c1c987456 ("crypto: aes - create AES library based on the fixed 
> time AES code")
> Cc:  # v5.4+
> Cc: Ard Biesheuvel 
> Signed-off-by: Eric Biggers 
> ---
>  crypto/aes_ti.c  | 18 --
>  lib/crypto/aes.c | 18 ++
>  2 files changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/crypto/aes_ti.c b/crypto/aes_ti.c
> index 205c2c257d4926..121f36621d6dcf 100644
> --- a/crypto/aes_ti.c
> +++ b/crypto/aes_ti.c
> @@ -20,33 +20,15 @@ static int aesti_set_key(struct crypto_tfm *tfm, const u8 
> *in_key,
>  static void aesti_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
>  {
> const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> -   unsigned long flags;
> -
> -   /*
> -* Temporarily disable interrupts to avoid races where cachelines are
> -* evicted when the CPU is interrupted to do something else.
> -*/
> -   local_irq_save(flags);
>
> aes_encrypt(ctx, out, in);
> -
> -   local_irq_restore(flags);
>  }
>
>  static void aesti_decrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
>  {
> const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> -   unsigned long flags;
> -
> -   /*
> -* Temporarily disable interrupts to avoid races where cachelines are
> -* evicted when the CPU is interrupted to do something else.
> -*/
> -   local_irq_save(flags);
>
> aes_decrypt(ctx, out, in);
> -
> -   local_irq_restore(flags);
>  }
>
>  static struct crypto_alg aes_alg = {
> diff --git a/lib/crypto/aes.c b/lib/crypto/aes.c
> index 827fe89922fff0..029d8d0eac1f6e 100644
> --- a/lib/crypto/aes.c
> +++ b/lib/crypto/aes.c
> @@ -260,6 +260,7 @@ void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 
> *out, const u8 *in)
> const u32 *rkp = ctx->key_enc + 4;
> int rounds = 6 + ctx->key_length / 4;
> u32 st0[4], st1[4];
> +   unsigned long flags;
> int round;
>
> st0[0] = ctx->key_enc[0] ^ get_unaligned_le32(in);
> @@ -267,6 +268,12 @@ void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 
> *out, const u8 *in)
> st0[2] = ctx->key_enc[2] ^ get_unaligned_le32(in + 8);
> st0[3] = ctx->key_enc[3] ^ get_unaligned_le32(in + 12);
>
> +   /*
> +* Temporarily disable interrupts to avoid races where cachelines are
> +* evicted when the CPU is interrupted to do something else.
> +*/
> +   local_irq_save(flags);
> +
> /*
>  * Force the compiler to emit data independent Sbox references,
>  * by xoring the input with Sbox values that are known to add up
> @@ -297,6 +304,8 @@ void aes_encrypt(const struct crypto_aes_ctx *ctx, u8 
> *out, const u8 *in)
> put_unaligned_le32(subshift(st1, 1) ^ rkp[5], out + 4);
> put_unaligned_le32(subshift(st1, 2) ^ rkp[6], out + 8);
> put_unaligned_le32(subshift(st1, 3) ^ rkp[7], out + 12);
> +
> +   local_irq_restore(flags);
>  }
>  EXPORT_SYMBOL(aes_encrypt);
>
> @@ -311,6 +320,7 @@ void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 
> *out, const u8 *in)
> const u32 *rkp = ctx->key_dec + 4;
> int rounds = 6 + ctx->key_length / 4;
> u32 st0[4], st1[4];
> +   unsigned long flags;
> int round;
>
> st0[0] = ctx->key_dec[0] ^ get_unaligned_le32(in);
> @@ -318,6 +328,12 @@ void aes_decrypt(const struct crypto_aes_ctx *ctx, u8 
> *out, const u8 *in)
> st0[2] = ctx->key_dec[2] ^ get_unaligned_le32(in + 8);
> st0[3] = ctx->key_dec[3] ^ get_unaligned_le32(in + 12);
>
> +   /*
> +* Temporarily disable interrupts to avoid races where cachelines are
> +* evicted when the CPU is interrupted to do something else.
> +*/
> +   local_irq_save(flags);
> +
> /*
>  * Force the compiler to emit data independe

Re: [PATCH v4 25/35] crypto: BLAKE2s - x86_64 SIMD implementation

2019-10-23 Thread Ard Biesheuvel
On Wed, 23 Oct 2019 at 16:08, Jason A. Donenfeld  wrote:
>
> On Wed, Oct 23, 2019 at 6:55 AM Eric Biggers  wrote:
> > There are no comments in this 685-line assembly language file.
> > Is this the original version, or is it a generated/stripped version?
>
> It looks like Ard forgot to import the latest one from Zinc, which is
> significantly shorter and has other improvements too:
>
> https://git.zx2c4.com/WireGuard/tree/src/crypto/zinc/blake2s/blake2s-x86_64.S

I can pick that up for v5. But that doesn't address Eric's question though.


Re: [RFC PATCH 4/5] crypto: ccp - add TEE support for Raven Ridge

2019-10-23 Thread Ard Biesheuvel
(+ Jens)

On Wed, 23 Oct 2019 at 13:27, Thomas, Rijo-john
 wrote:
>
> Adds a PCI device entry for Raven Ridge. Raven Ridge is an APU with a
> dedicated AMD Secure Processor having Trusted Execution Environment (TEE)
> support. The TEE provides a secure environment for running Trusted
> Applications (TAs) which implement security-sensitive parts of a feature.
>
> This patch configures AMD Secure Processor's TEE interface by initializing
> a ring buffer (shared memory between Rich OS and Trusted OS) which can hold
> multiple command buffer entries. The TEE interface is facilitated by a set
> of CPU to PSP mailbox registers.
>
> The next patch will address how commands are submitted to the ring buffer.
>
> Signed-off-by: Rijo Thomas 
> Signed-off-by: Devaraj Rangasamy 
> ---
>  drivers/crypto/ccp/Makefile  |   3 +-
>  drivers/crypto/ccp/psp-dev.c |  74 +-
>  drivers/crypto/ccp/psp-dev.h |   8 ++
>  drivers/crypto/ccp/sp-dev.h  |  11 +-
>  drivers/crypto/ccp/sp-pci.c  |  27 -
>  drivers/crypto/ccp/tee-dev.c | 237 
> +++
>  drivers/crypto/ccp/tee-dev.h | 108 
>  7 files changed, 461 insertions(+), 7 deletions(-)
>  create mode 100644 drivers/crypto/ccp/tee-dev.c
>  create mode 100644 drivers/crypto/ccp/tee-dev.h
>

How does this patch tie into the TEE subsystem we have in drivers/tee?




> diff --git a/drivers/crypto/ccp/Makefile b/drivers/crypto/ccp/Makefile
> index 3b29ea4..db362fe 100644
> --- a/drivers/crypto/ccp/Makefile
> +++ b/drivers/crypto/ccp/Makefile
> @@ -9,7 +9,8 @@ ccp-$(CONFIG_CRYPTO_DEV_SP_CCP) += ccp-dev.o \
>  ccp-$(CONFIG_CRYPTO_DEV_CCP_DEBUGFS) += ccp-debugfs.o
>  ccp-$(CONFIG_PCI) += sp-pci.o
>  ccp-$(CONFIG_CRYPTO_DEV_SP_PSP) += psp-dev.o \
> -   sev-dev.o
> +   sev-dev.o \
> +   tee-dev.o
>
>  obj-$(CONFIG_CRYPTO_DEV_CCP_CRYPTO) += ccp-crypto.o
>  ccp-crypto-objs := ccp-crypto-main.o \
> diff --git a/drivers/crypto/ccp/psp-dev.c b/drivers/crypto/ccp/psp-dev.c
> index ef8affa..90bcd5f 100644
> --- a/drivers/crypto/ccp/psp-dev.c
> +++ b/drivers/crypto/ccp/psp-dev.c
> @@ -13,6 +13,7 @@
>  #include "sp-dev.h"
>  #include "psp-dev.h"
>  #include "sev-dev.h"
> +#include "tee-dev.h"
>
>  struct psp_device *psp_master;
>
> @@ -45,6 +46,9 @@ static irqreturn_t psp_irq_handler(int irq, void *data)
> if (status) {
> if (psp->sev_irq_handler)
> psp->sev_irq_handler(irq, psp->sev_irq_data, status);
> +
> +   if (psp->tee_irq_handler)
> +   psp->tee_irq_handler(irq, psp->tee_irq_data, status);
> }
>
> /* Clear the interrupt status by writing the same value we read. */
> @@ -53,10 +57,11 @@ static irqreturn_t psp_irq_handler(int irq, void *data)
> return IRQ_HANDLED;
>  }
>
> -static int psp_check_sev_support(struct psp_device *psp)
> +static int psp_check_sev_support(struct psp_device *psp,
> +unsigned int capability)
>  {
> /* Check if device supports SEV feature */
> -   if (!(ioread32(psp->io_regs + psp->vdata->feature_reg) & 1)) {
> +   if (!(capability & 1)) {
> dev_dbg(psp->dev, "psp does not support SEV\n");
> return -ENODEV;
> }
> @@ -64,10 +69,54 @@ static int psp_check_sev_support(struct psp_device *psp)
> return 0;
>  }
>
> +static int psp_check_tee_support(struct psp_device *psp,
> +unsigned int capability)
> +{
> +   /* Check if device supports TEE feature */
> +   if (!(capability & 2)) {
> +   dev_dbg(psp->dev, "psp does not support TEE\n");
> +   return -ENODEV;
> +   }
> +
> +   return 0;
> +}
> +
> +static int psp_check_support(struct psp_device *psp, unsigned int capability)
> +{
> +   int sev_support = psp_check_sev_support(psp, capability);
> +   int tee_support = psp_check_tee_support(psp, capability);
> +
> +   /* Check if device supports SEV and TEE feature */
> +   if (sev_support && tee_support)
> +   return -ENODEV;
> +
> +   return 0;
> +}
> +
> +static int psp_init(struct psp_device *psp, unsigned int capability)
> +{
> +   int ret;
> +
> +   if (!psp_check_sev_support(psp, capability)) {
> +   ret = sev_dev_init(psp);
> +   if (ret)
> +   return ret;
> +   }
> +
> +   if (!psp_check_tee_support(psp, capability)) {
> +   ret = tee_dev_init(psp);
> +   if (ret)
> +   return ret;
> +   }
> +
> +   return 0;
> +}
> +
>  int psp_dev_init(struct sp_device *sp)
>  {
> struct device *dev = sp->dev;
> struct psp_device *psp;
> +   unsigned int capability;
> int ret;
>
> ret = -ENOMEM;
> @@ -86,7 +135,10 @@ int psp_dev_init(struct sp_device *sp)
>
> psp->io_regs 

Re: [RFC PATCH 1/5] crypto: ccp - rename psp-dev files to sev-dev

2019-10-23 Thread Ard Biesheuvel
Hello Thomas,

On Wed, 23 Oct 2019 at 13:27, Thomas, Rijo-john
 wrote:
>
> This is a preliminary patch for creating a generic PSP device driver
> file, which will have support for both SEV and TEE (Trusted Execution
> Environment) interface.
>
> This patch does not introduce any new functionality, but simply renames
> psp-dev.c and psp-dev.h files to sev-dev.c and sev-dev.h files
> respectively.
>
> Signed-off-by: Rijo Thomas 
> Signed-off-by: Devaraj Rangasamy 

This is not the correct way to credit a co-author.

You are sending the patch, so your signoff should come last.

If Devaraj is a co-author of this work, you should add the following
lines *before* your signoff

Co-authored-by: Devaraj Rangasamy 
Signed-off-by: Devaraj Rangasamy 

If Devaraj is the sole author of this work, and you are just sending
it out, you should set the authorship on the patch to Devaraj (so it
will be From: Devaraj Rangasamy )
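
In the co-author case, the complete trailer block would then read, with the
sender's signoff last (addresses elided here as elsewhere in this archive):

Co-authored-by: Devaraj Rangasamy 
Signed-off-by: Devaraj Rangasamy 
Signed-off-by: Rijo Thomas 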

> ---
>  drivers/crypto/ccp/Makefile  |2 +-
>  drivers/crypto/ccp/psp-dev.c | 1087 
> --
>  drivers/crypto/ccp/psp-dev.h |   66 ---
>  drivers/crypto/ccp/sev-dev.c | 1087 
> ++
>  drivers/crypto/ccp/sev-dev.h |   66 +++
>  drivers/crypto/ccp/sp-pci.c  |2 +-
>  6 files changed, 1155 insertions(+), 1155 deletions(-)
>  delete mode 100644 drivers/crypto/ccp/psp-dev.c
>  delete mode 100644 drivers/crypto/ccp/psp-dev.h
>  create mode 100644 drivers/crypto/ccp/sev-dev.c
>  create mode 100644 drivers/crypto/ccp/sev-dev.h
>

Please regenerate the patch so that the rename is reflected in the diffstat.
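
For instance, regenerating the series with rename detection enabled should
collapse the delete/create pairs into rename entries in the diffstat; an
illustrative invocation (the commit selection is just an example):

  git format-patch -M -1 HEAD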


[PATCH] crypto: ecdh - fix big endian bug in ECC library

2019-10-23 Thread Ard Biesheuvel
The elliptic curve arithmetic library used by the EC-DH KPP implementation
assumes big endian byte order, and unconditionally reverses the byte
and word order of multi-limb quantities. On big endian systems, the byte
reordering is not necessary, while the word ordering needs to be retained.

So replace the __swab64() invocation with a call to be64_to_cpu() which
should do the right thing for both little and big endian builds.

Cc:  # v4.9+
Signed-off-by: Ard Biesheuvel 
---
 crypto/ecc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/crypto/ecc.c b/crypto/ecc.c
index dfe114bc0c4a..8ee787723c5c 100644
--- a/crypto/ecc.c
+++ b/crypto/ecc.c
@@ -1284,10 +1284,11 @@ EXPORT_SYMBOL(ecc_point_mult_shamir);
 static inline void ecc_swap_digits(const u64 *in, u64 *out,
   unsigned int ndigits)
 {
+   const __be64 *src = (__force __be64 *)in;
int i;
 
for (i = 0; i < ndigits; i++)
-   out[i] = __swab64(in[ndigits - 1 - i]);
+   out[i] = be64_to_cpu(src[ndigits - 1 - i]);
 }
 
 static int __ecc_is_key_valid(const struct ecc_curve *curve,
-- 
2.20.1
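
To spell out the difference (an illustration only, not part of the patch):
__swab64() always byte-swaps, whereas be64_to_cpu() byte-swaps only on
little-endian builds and is a no-op on big-endian ones, which is what the
big-endian source digits require.

        /* one big-endian 64-bit limb of the input, as ecc_swap_digits() sees it */
        __be64 digit = cpu_to_be64(0x0123456789abcdefULL);

        u64 wrong = __swab64((__force u64)digit); /* swaps unconditionally: breaks BE builds */
        u64 right = be64_to_cpu(digit);           /* swaps on LE, identity on BE */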



Re: [PATCH v6 0/2] BLAKE2b generic implementation

2019-10-23 Thread Ard Biesheuvel
On Wed, 23 Oct 2019 at 02:12, David Sterba  wrote:
>
> The patchset adds blake2b reference implementation and test vectors.
>
> V6:
>
> Patch 2/2: test vectors fixed to actually match the proposed table of
> key and plaintext combinations. I shamelessly copied the test vector
> value format that Ard uses for the blake2s test vectors. The array
> blake2b_ordered_sequence can be shared between 2s and 2b but as the
> patchsets go separate, unification would have to happen once both
> are merged.
>
> Tested on x86_64 with KASAN and SLUB_DEBUG.
>

Tested-by: Ard Biesheuvel  # arm64 big-endian


> V1: 
> https://lore.kernel.org/linux-crypto/cover.1569849051.git.dste...@suse.com/
> V2: 
> https://lore.kernel.org/linux-crypto/e31c2030fcfa7f409b2c81adf8f179a8a55a584a.1570184333.git.dste...@suse.com/
> V3: 
> https://lore.kernel.org/linux-crypto/e7f46def436c2c705c0b2cac3324f817efa4717d.1570715842.git.dste...@suse.com/
> V4: 
> https://lore.kernel.org/linux-crypto/cover.1570812094.git.dste...@suse.com/
> V5: 
> https://lore.kernel.org/linux-crypto/cover.1571043883.git.dste...@suse.com/
>
> David Sterba (2):
>   crypto: add blake2b generic implementation
>   crypto: add test vectors for blake2b
>
>  crypto/Kconfig   |  17 ++
>  crypto/Makefile  |   1 +
>  crypto/blake2b_generic.c | 413 +++
>  crypto/testmgr.c |  28 +++
>  crypto/testmgr.h | 307 +
>  include/crypto/blake2b.h |  46 +

Final nit: do we need this header file at all? Could we move the
contents into crypto/blake2b_generic.c? Or is the btrfs code going to
#include it?



>  6 files changed, 812 insertions(+)
>  create mode 100644 crypto/blake2b_generic.c
>  create mode 100644 include/crypto/blake2b.h
>
> --
> 2.23.0
>


Re: [PATCH v2] crypto: arm64/aes-neonbs - add return value of skcipher_walk_done() in __xts_crypt()

2019-10-22 Thread Ard Biesheuvel
On Tue, 22 Oct 2019 at 09:28, Yunfeng Ye  wrote:
>
> A warning is found by the static code analysis tool:
>   "Identical condition 'err', second condition is always false"
>
> Fix this by adding return value of skcipher_walk_done().
>
> Fixes: 67cfa5d3b721 ("crypto: arm64/aes-neonbs - implement ciphertext 
> stealing for XTS")
> Signed-off-by: Yunfeng Ye 

Acked-by: Ard Biesheuvel 

> ---
> v1 -> v2:
>  - update the subject and comment
>  - add return value of skcipher_walk_done()
>
>  arch/arm64/crypto/aes-neonbs-glue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/crypto/aes-neonbs-glue.c 
> b/arch/arm64/crypto/aes-neonbs-glue.c
> index ea873b8904c4..e3e27349a9fe 100644
> --- a/arch/arm64/crypto/aes-neonbs-glue.c
> +++ b/arch/arm64/crypto/aes-neonbs-glue.c
> @@ -384,7 +384,7 @@ static int __xts_crypt(struct skcipher_request *req, bool 
> encrypt,
> goto xts_tail;
>
> kernel_neon_end();
> -   skcipher_walk_done(&walk, nbytes);
> +   err = skcipher_walk_done(&walk, nbytes);
> }
>
> if (err || likely(!tail))
> --
> 2.7.4.3
>


Re: [PATCH] crypto: arm64/aes-neonbs - remove redundant code in __xts_crypt()

2019-10-21 Thread Ard Biesheuvel
On Tue, 22 Oct 2019 at 08:42, Yunfeng Ye  wrote:
>
> A warning is found by the static code analysis tool:
>   "Identical condition 'err', second condition is always false"
>
> Fix this by removing the redundant condition @err.
>
> Signed-off-by: Yunfeng Ye 

Please don't blindly 'fix' crypto code without reading it carefully
and without cc'ing the author.

The correct fix is to add the missing assignment of 'err', like so

diff --git a/arch/arm64/crypto/aes-neonbs-glue.c
b/arch/arm64/crypto/aes-neonbs-glue.c
index ea873b8904c4..e3e27349a9fe 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -384,7 +384,7 @@ static int __xts_crypt(struct skcipher_request
*req, bool encrypt,
goto xts_tail;

kernel_neon_end();
-   skcipher_walk_done(&walk, nbytes);
+   err = skcipher_walk_done(&walk, nbytes);
}

if (err || likely(!tail))

Does that make the warning go away?


> ---
>  arch/arm64/crypto/aes-neonbs-glue.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/crypto/aes-neonbs-glue.c 
> b/arch/arm64/crypto/aes-neonbs-glue.c
> index ea873b8904c4..7b342db428b0 100644
> --- a/arch/arm64/crypto/aes-neonbs-glue.c
> +++ b/arch/arm64/crypto/aes-neonbs-glue.c
> @@ -387,7 +387,7 @@ static int __xts_crypt(struct skcipher_request *req, bool 
> encrypt,
> skcipher_walk_done(&walk, nbytes);
> }
>
> -   if (err || likely(!tail))
> +   if (likely(!tail))
> return err;
>
> /* handle ciphertext stealing */
> --
> 2.7.4.3
>
>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


Re: Key endianness?

2019-10-21 Thread Ard Biesheuvel
On Mon, 21 Oct 2019 at 21:14, Pascal Van Leeuwen
 wrote:
>
> > -Original Message-
> > From: Ard Biesheuvel 
> > Sent: Monday, October 21, 2019 6:15 PM
> > To: Pascal Van Leeuwen 
> > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > Subject: Re: Key endianness?
> >
> > On Mon, 21 Oct 2019 at 17:55, Pascal Van Leeuwen
> >  wrote:
> > >
> > > > -Original Message-
> > > > From: Ard Biesheuvel 
> > > > Sent: Monday, October 21, 2019 5:32 PM
> > > > To: Pascal Van Leeuwen 
> > > > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > > > Subject: Re: Key endianness?
> > > >
> > > >
> > > > On Mon, 21 Oct 2019 at 17:23, Pascal Van Leeuwen
> > > >  wrote:
> > > > >
> > > > > > -Original Message-
> > > > > > From: Ard Biesheuvel 
> > > > > > Sent: Monday, October 21, 2019 2:54 PM
> > > > > > To: Pascal Van Leeuwen 
> > > > > > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > > > > > Subject: Re: Key endianness?
> > > > > >
> > > > > > On Mon, 21 Oct 2019 at 14:40, Pascal Van Leeuwen
> > > > > >  wrote:
> > > > > > >
> > > > > > > > -Original Message-
> > > > > > > > From: Ard Biesheuvel 
> > > > > > > > Sent: Monday, October 21, 2019 1:59 PM
> > > > > > > > To: Pascal Van Leeuwen 
> > > > > > > > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > > > > > > > Subject: Re: Key endianness?
> > > > > > > >
> > > > > > > > On Mon, 21 Oct 2019 at 12:56, Pascal Van Leeuwen
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > Another endianness question:
> > > > > > > > >
> > > > > > > > > I have some data structure that can be either little or big 
> > > > > > > > > endian,
> > > > > > > > > depending on the exact use case. Currently, I have it defined 
> > > > > > > > > as u32.
> > > > > > > > > This causes sparse errors when accessing it using 
> > > > > > > > > cpu_to_Xe32() and
> > > > > > > > > Xe32_to_cpu().
> > > > > > > > >
> > > > > > > > > Now, for the big endian case, I could use htonl()/ntohl() 
> > > > > > > > > instead,
> > > > > > > > > but this is inconsistent with all other endian conversions in 
> > > > > > > > > the
> > > > > > > > > driver ... and there's no little endian alternative I'm aware 
> > > > > > > > > of.
> > > > > > > > > So I don't really like that approach.
> > > > > > > > >
> > > > > > > > > Alternatively, I could define a union of both a big and little
> > > > > > > > > endian version of the data but that would require touching a 
> > > > > > > > > lot
> > > > > > > > > of legacy code (unless I use a C11 anonymous union ... not 
> > > > > > > > > sure
> > > > > > > > > if that would be allowed?) and IMHO is a bit silly.
> > > > > > > > >
> > > > > > > > > Is there some way of telling sparse to _not_ check for 
> > > > > > > > > "correct"
> > > > > > > > > use of these functions for a certain variable?
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > In this case, just use (__force __Xe32*) to cast it to the 
> > > > > > > > correct
> > > > > > > > type. This annotates the cast as being intentionally 
> > > > > > > > endian-unclean,
> > > > > > > > and shuts up Sparse.
> > > > > > > >
> > > > > > > Thanks for trying to help out, but that just gives me an
> > > > > > > "error: not an lvalue" from both 

Re: Key endianness?

2019-10-21 Thread Ard Biesheuvel
On Mon, 21 Oct 2019 at 17:55, Pascal Van Leeuwen
 wrote:
>
> > -Original Message-
> > From: Ard Biesheuvel 
> > Sent: Monday, October 21, 2019 5:32 PM
> > To: Pascal Van Leeuwen 
> > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > Subject: Re: Key endianness?
> >
> >
> > On Mon, 21 Oct 2019 at 17:23, Pascal Van Leeuwen
> >  wrote:
> > >
> > > > -Original Message-
> > > > From: Ard Biesheuvel 
> > > > Sent: Monday, October 21, 2019 2:54 PM
> > > > To: Pascal Van Leeuwen 
> > > > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > > > Subject: Re: Key endianness?
> > > >
> > > > On Mon, 21 Oct 2019 at 14:40, Pascal Van Leeuwen
> > > >  wrote:
> > > > >
> > > > > > -Original Message-
> > > > > > From: Ard Biesheuvel 
> > > > > > Sent: Monday, October 21, 2019 1:59 PM
> > > > > > To: Pascal Van Leeuwen 
> > > > > > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > > > > > Subject: Re: Key endianness?
> > > > > >
> > > > > > On Mon, 21 Oct 2019 at 12:56, Pascal Van Leeuwen
> > > > > >  wrote:
> > > > > > >
> > > > > > > Another endianness question:
> > > > > > >
> > > > > > > I have some data structure that can be either little or big 
> > > > > > > endian,
> > > > > > > depending on the exact use case. Currently, I have it defined as 
> > > > > > > u32.
> > > > > > > This causes sparse errors when accessing it using cpu_to_Xe32() 
> > > > > > > and
> > > > > > > Xe32_to_cpu().
> > > > > > >
> > > > > > > Now, for the big endian case, I could use htonl()/ntohl() instead,
> > > > > > > but this is inconsistent with all other endian conversions in the
> > > > > > > driver ... and there's no little endian alternative I'm aware of.
> > > > > > > So I don't really like that approach.
> > > > > > >
> > > > > > > Alternatively, I could define a union of both a big and little
> > > > > > > endian version of the data but that would require touching a lot
> > > > > > > of legacy code (unless I use a C11 anonymous union ... not sure
> > > > > > > if that would be allowed?) and IMHO is a bit silly.
> > > > > > >
> > > > > > > Is there some way of telling sparse to _not_ check for "correct"
> > > > > > > use of these functions for a certain variable?
> > > > > > >
> > > > > >
> > > > > >
> > > > > > In this case, just use (__force __Xe32*) to cast it to the correct
> > > > > > type. This annotates the cast as being intentionally endian-unclean,
> > > > > > and shuts up Sparse.
> > > > > >
> > > > > Thanks for trying to help out, but that just gives me an
> > > > > "error: not an lvalue" from both sparse and GCC.
> > > > > But I'm probably doing it wrong somehow ...
> > > > >
> > > >
> > > > It depends on what you are casting. But doing something like
> > > >
> > > > u32 l = ...
> > > > __le32 ll = (__force __le32)l
> > > >
> > > > should not trigger a sparse warning.
> > > >
> > > I was actually casting the left side, not the right side,
> > > as that's where my sparse issue was. Must be my poor grasp
> > > of the C language hurting me here as I don't understand why
> > > I'm not allowed to cast an array element to a different type
> > > of the _same size_ ...
> > >
> > > i.e. why can't I do (__be32)some_u32_array[3] = cpu_to_be32(some_value)?
> > >
> >
> > Because a cast only changes the type of the value that an expression
> > yields - the result of a cast is not an lvalue, so you cannot assign to
> > it. A variable has a type already, and you cannot cast that away - what
> > would that mean, exactly? Would all occurrences of some_u32_array[]
> > suddenly have a different type? Or only element [3]?
> >
> I think it would be perfectly logical to do such a cast and I'm really
> surprised tha

Re: Key endianness?

2019-10-21 Thread Ard Biesheuvel

On Mon, 21 Oct 2019 at 17:23, Pascal Van Leeuwen
 wrote:
>
> > -Original Message-
> > From: Ard Biesheuvel 
> > Sent: Monday, October 21, 2019 2:54 PM
> > To: Pascal Van Leeuwen 
> > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > Subject: Re: Key endianness?
> >
> > On Mon, 21 Oct 2019 at 14:40, Pascal Van Leeuwen
> >  wrote:
> > >
> > > > -Original Message-
> > > > From: Ard Biesheuvel 
> > > > Sent: Monday, October 21, 2019 1:59 PM
> > > > To: Pascal Van Leeuwen 
> > > > Cc: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > > > Subject: Re: Key endianness?
> > > >
> > > > On Mon, 21 Oct 2019 at 12:56, Pascal Van Leeuwen
> > > >  wrote:
> > > > >
> > > > > Another endianness question:
> > > > >
> > > > > I have some data structure that can be either little or big endian,
> > > > > depending on the exact use case. Currently, I have it defined as u32.
> > > > > This causes sparse errors when accessing it using cpu_to_Xe32() and
> > > > > Xe32_to_cpu().
> > > > >
> > > > > Now, for the big endian case, I could use htonl()/ntohl() instead,
> > > > > but this is inconsistent with all other endian conversions in the
> > > > > driver ... and there's no little endian alternative I'm aware of.
> > > > > So I don't really like that approach.
> > > > >
> > > > > Alternatively, I could define a union of both a big and little
> > > > > endian version of the data but that would require touching a lot
> > > > > of legacy code (unless I use a C11 anonymous union ... not sure
> > > > > if that would be allowed?) and IMHO is a bit silly.
> > > > >
> > > > > Is there some way of telling sparse to _not_ check for "correct"
> > > > > use of these functions for a certain variable?
> > > > >
> > > >
> > > >
> > > > In this case, just use (__force __Xe32*) to cast it to the correct
> > > > type. This annotates the cast as being intentionally endian-unclean,
> > > > and shuts up Sparse.
> > > >
> > > Thanks for trying to help out, but that just gives me an
> > > "error: not an lvalue" from both sparse and GCC.
> > > But I'm probably doing it wrong somehow ...
> > >
> >
> > It depends on what you are casting. But doing something like
> >
> > u32 l = ...
> > __le32 ll = (__force __le32)l
> >
> > should not trigger a sparse warning.
> >
> I was actually casting the left side, not the right side,
> as that's where my sparse issue was. Must be my poor grasp
> of the C language hurting me here as I don't understand why
> I'm not allowed to cast an array element to a different type
> of the _same size_ ...
>
> i.e. why can't I do (__be32)some_u32_array[3] = cpu_to_be32(some_value)?
>

Because a cast only changes the type of the value that an expression yields -
the result of a cast is not an lvalue, so you cannot assign to it. A variable
has a type already, and you cannot cast that away - what would that mean,
exactly? Would all occurrences of some_u32_array[] suddenly have a different
type? Or only element [3]?


> I managed to work around it by doing *(__be32 *)&some_u32_array[3] =
> but that's pretty ugly ... a better approach is still welcome.
>

You need to cast the right hand side, not the left hand side. If
some_u32_array is u32[], force cast it to (__force u32)
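
Concretely, assuming some_u32_array really is declared as a plain u32 array
(and the usual kernel byteorder headers are in scope), the sparse-clean form
casts the value being stored rather than the array slot; a minimal sketch:

        u32 some_u32_array[4];
        u32 some_value = 0x12345678;

        /* store: convert to big-endian, force-cast the __be32 back to u32 */
        some_u32_array[3] = (__force u32)cpu_to_be32(some_value);

        /* load: force-cast the raw u32 to __be32 before converting back */
        some_value = be32_to_cpu((__force __be32)some_u32_array[3]);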

> >
> > > > > Regards,
> > > > > Pascal van Leeuwen
> > > > > Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
> > > > > www.insidesecure.com
> > > > >
> > > > > > -Original Message-
> > > > > > From: Pascal Van Leeuwen
> > > > > > Sent: Monday, October 21, 2019 11:04 AM
> > > > > > To: linux-crypto@vger.kernel.org; herb...@gondor.apana.org.au
> > > > > > Subject: Key endianness?
> > > > > >
> > > > > > Herbert,
> > > > > >
> > > > > > I'm currently busy fixing some endianness related sparse errors 
> > > > > > reported
> > > > > > by this kbuild test robot and this triggered my to rethink some 
> > > > > > endian
> > > > > > conversion being done in the inside-secure driver.
> > > > &
