Re: [PATCH 0/5] crypto: Speck support

2018-02-09 Thread Jeffrey Walton
On Thu, Feb 8, 2018 at 4:01 PM, Eric Biggers  wrote:
> On Wed, Feb 07, 2018 at 08:47:05PM -0500, Jeffrey Walton wrote:
>> On Wed, Feb 7, 2018 at 7:09 PM, Eric Biggers  wrote:
>> > Hello,
>> >
>> > This series adds Speck support to the crypto API, including the Speck128
>> > and Speck64 variants.  Speck is a lightweight block cipher that can be
>> > much faster than AES on processors that don't have AES instructions.
>> >
>> > We are planning to offer Speck-XTS (probably Speck128/256-XTS) as an
>> > option for dm-crypt and fscrypt on Android, for low-end mobile devices
>> > with older CPUs such as ARMv7 which don't have the Cryptography
>> > Extensions.  Currently, such devices are unencrypted because AES is not
>> > fast enough, even when the NEON bit-sliced implementation of AES is
>> > used.  Other AES alternatives such as Blowfish, Twofish, Camellia,
>> > Cast6, and Serpent aren't fast enough either; it seems that only a
>> > modern ARX cipher can provide sufficient performance on these devices.
>> >
>> > This is a replacement for our original proposal
>> > (https://patchwork.kernel.org/patch/10101451/) which was to offer
>> > ChaCha20 for these devices.  However, the use of a stream cipher for
>> > disk/file encryption with no space to store nonces would have been much
>> > more insecure than we thought initially, given that it would be used on
>> > top of flash storage as well as potentially on top of F2FS, neither of
>> > which is guaranteed to overwrite data in-place.
>> >
>> > ...
>> > Thus, patch 1 adds a generic implementation of Speck, and the following
>> > patches add a 32-bit ARM NEON implementation of Speck-XTS.  The
>> > NEON-accelerated implementation is much faster than the generic
>> > implementation and therefore is the implementation that would primarily
>> > be used in practice on the devices we are targeting.
>> >
>> > There is no AArch64 implementation added, since such CPUs are likely to
>> > have the Cryptography Extensions, allowing the use of AES.
>>
>> +1 on SPECK.
>> ...
>
> Hi Jeffrey,
>
> I see you wrote the SPECK implementation in Crypto++, and you are treating the
> words as big endian.
>
> Do you have a reference for this being the "correct" order?  Unfortunately the
> authors of the cipher failed to mention the byte order in their paper.  And
> they gave the test vectors as words, so the test vectors don't clarify it
> either.
>
> I had assumed little endian words, but now I am having second thoughts...  And
> to confuse things further, it seems that some implementations (including the
> authors' own implementation for the SUPERCOP benchmark toolkit [1]) even
> consider the words themselves in the order (y, x) rather than the more
> intuitive (x, y).
>
> [1] https://github.com/iadgov/simon-speck-supercop/blob/master/crypto_stream/speck128128ctr/ref/stream.c
>
> In fact, even the reference code from the paper treats pt[0] as y and pt[1] as
> x, where 'pt' is a u64 array -- although that being said, it's not shown how
> the actual bytes should be translated to/from those u64 arrays.
>
> I'd really like to avoid people having to add additional versions of SPECK
> later for the different byte and word orders...

Hi Eric,

Yeah, this was a point of confusion for us as well. After the sidebar
conversations I am wondering about the correctness of the Crypto++
implementation.

As a first step, here is the official test vector for Speck-128(128)
from Appendix C, p. 42 (https://eprint.iacr.org/2013/404.pdf):

Speck128/128
Key: 0f0e0d0c0b0a0908 0706050403020100
Plaintext: 6c61766975716520 7469206564616d20
Ciphertext: a65d985179783265 7860fedf5c570d18

We had some confusion over the presentation. Here is what the Simon
and Speck team sent when I asked about it, what gets plugged into the
algorithm, and how it gets plugged in:



On Mon, Nov 20, 2017 at 10:50 AM,  wrote:
> ...
> I'll explain the problem you have been having with our test vectors.
>
> The key is:  0x0f0e0d0c0b0a0908 0x0706050403020100
> The plaintext is:  6c61766975716520 7469206564616d20
> The ciphertext is:  a65d985179783265 7860fedf5c570d18
>
> The problem is essentially one of what goes where and we probably could
> have done a better job explaining things.
>
> For the key, with two words, K=(K[1],K[0]).  With three words
> K=(K[2],K[1],K[0]), with four words K=(K[3],K[2],K[1],K[0]).
>
> So for the test vector you should have K[0]= 0x0706050403020100,
> K[1]= 0x0f0e0d0c0b0a0908 which is the opposite of what you have done.
>
> If we put this K into ExpandKey(K,sk) then the first few round keys
> are:
>
> 0706050403020100
> 37253b31171d0309
> f91d89cc90c4085c
> c6b1f07852cc7689
> ...
>
> For the plaintext, P=(P[1],P[0]), i.e., P[1] goes into the left word of the
> block cipher and P[0] goes into the right word of the block cipher.  So you
> should have m[0]= 7469206564616d20 and m[1]= 6c61766975716520, which is again 
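
To make the word ordering described in the quoted mail concrete, here is
a minimal C sketch of Speck128/128 working purely on 64-bit words. The
function names are made up for this illustration, and the rotation
amounts (8 and 3) and the round count (32) are taken from the Speck
paper, not from this thread. It only shows which word goes where
(K[0] and P[0] on the right, K[1] and P[1] on the left); it says nothing
about how bytes should be packed into those words, which is the question
that is still open above.

```c
#include <stdint.h>

#define SPECK128_128_ROUNDS 32	/* from the Speck paper, not from this thread */

static uint64_t ror64(uint64_t x, unsigned int r) { return (x >> r) | (x << (64 - r)); }
static uint64_t rol64(uint64_t x, unsigned int r) { return (x << r) | (x >> (64 - r)); }

/* One Speck round on the word pair (x, y) with round key k. */
static void speck_round(uint64_t *x, uint64_t *y, uint64_t k)
{
	*x = ror64(*x, 8);
	*x += *y;
	*x ^= k;
	*y = rol64(*y, 3);
	*y ^= *x;
}

/*
 * Key schedule. K = (K[1], K[0]): K[0] is the right-hand word and is
 * used directly as the first round key.
 */
void speck128_128_expand(const uint64_t K[2],
			 uint64_t rk[SPECK128_128_ROUNDS])
{
	uint64_t b = K[0], a = K[1];
	unsigned int i;

	for (i = 0; i < SPECK128_128_ROUNDS - 1; i++) {
		rk[i] = b;
		speck_round(&a, &b, i);	/* round counter acts as the key */
	}
	rk[SPECK128_128_ROUNDS - 1] = b;
}

/*
 * Encrypt one block. P = (P[1], P[0]): pt[1] is the left word (x) and
 * pt[0] is the right word (y); the ciphertext uses the same layout.
 */
void speck128_128_encrypt(const uint64_t rk[SPECK128_128_ROUNDS],
			  const uint64_t pt[2], uint64_t ct[2])
{
	uint64_t y = pt[0], x = pt[1];
	unsigned int i;

	for (i = 0; i < SPECK128_128_ROUNDS; i++)
		speck_round(&x, &y, rk[i]);
	ct[0] = y;
	ct[1] = x;
}
```

With K[0] = 0x0706050403020100 and K[1] = 0x0f0e0d0c0b0a0908, the first
two round keys produced by this sketch are 0706050403020100 and
37253b31171d0309, matching the list in the quoted mail.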

[PATCH v3 3/4] crypto: AF_ALG - allow driver to serialize IV access

2018-02-09 Thread Stephan Müller
The mutex in AF_ALG to serialize access to the IV ensures full
serialization of requests sent to the crypto driver.

However, the hardware may implement serialization to the IV such that
preparation work without touching the IV can already happen while the IV
is processed by another operation. This may speed up the AIO processing.

The following ASCII art demonstrates this.

AF_ALG mutex handling implements the following logic:

[REQUEST 1 from userspace]   [REQUEST 2 from userspace]
         |                            |
  [AF_ALG/SOCKET]              [AF_ALG/SOCKET]
         |                            |
NOTHING BLOCKING (lock mutex)         |
         |                     Queued on Mutex
 [BUILD / MAP HW DESCS]               |
         |                            |
  [Place in HW Queue]                 |
         |                            |
     [Process]                        |
         |                            |
    [Interrupt]                       |
         |                            |
[Respond and update IV]               |
         |                            |
  [unlock mutex]           Nothing Blocking (lock mutex)
         |                            |
Don't care beyond here.      [BUILD / MAP HW DESCS]
                                      |
                             [Place in HW Queue]
                                      |
                                  [Process]
                                      |
                                 [Interrupt]
                                      |
                           [Respond and update IV]
                                      |
                           Don't care beyond here.

A crypto driver may implement the following serialization:

[REQUEST 1 from userspace]   [REQUEST 2 from userspace]
         |                            |
  [AF_ALG/SOCKET]              [AF_ALG/SOCKET]
         |                            |
 [BUILD / MAP HW DESCS]      [BUILD / MAP HW DESCS]
         |                            |
 NOTHING BLOCKING         IV in flight (enqueue sw queue)
         |                            |
  [Place in HW Queue]                 |
         |                            |
     [Process]                        |
         |                            |
    [Interrupt]                       |
         |                            |
[Respond and update IV]    Nothing Blocking (dequeue sw queue)
         |                            |
Don't care beyond here.       [Place in HW Queue]
                                      |
                                  [Process]
                                      |
                                 [Interrupt]
                                      |
                           [Respond and update IV]
                                      |
                           Don't care beyond here.

If the driver implements its own serialization (i.e. AF_ALG does not
serialize the access to the IV), the crypto implementation must set the
flag CRYPTO_ALG_SERIALIZES_IV_ACCESS.

Signed-off-by: Stephan Mueller 
---
 crypto/af_alg.c | 13 -
 crypto/algif_aead.c |  1 +
 crypto/algif_skcipher.c |  1 +
 include/crypto/if_alg.h | 13 +
 include/linux/crypto.h  | 15 +++
 5 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 7f80dcfc12a6..56b4da65025a 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1247,8 +1247,10 @@ int af_alg_get_iv(struct sock *sk, struct af_alg_async_req *areq)
return 0;
 
/* No inline IV, use ctx IV buffer and lock it */
-   if (ctx->iiv == ALG_IV_SERIAL_PROCESSING)
-   return mutex_lock_interruptible(&ctx->ivlock);
+   if (ctx->iiv == ALG_IV_SERIAL_PROCESSING) {
+   return (ctx->lock_iv) ?
+   mutex_lock_interruptible(&ctx->ivlock) : 0;
+   }
 
/* Inline IV handling: There must be the IV data present. */
if (ctx->used < ctx->ivlen || list_empty(&ctx->tsgl_list))
@@ -1280,12 +1282,13 @@ void af_alg_put_iv(struct sock *sk)
struct alg_sock *ask = alg_sk(sk);
struct af_alg_ctx *ctx = ask->private;
 
-   /* To resemble af_alg_get_iv, do not merge the two branches. */
if (!ctx->ivlen)
return;
 
-   if (ctx->iiv == ALG_IV_SERIAL_PROCESSING)
-   mutex_unlock(&ctx->ivlock);
+   if (ctx->iiv == ALG_IV_SERIAL_PROCESSING) {
+   if (ctx->lock_iv)
+   mutex_unlock(&ctx->ivlock);
+   }
 }
 EXPORT_SYMBOL_GPL(af_alg_put_iv);
 
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 4c425effc5a5..619147792cc9 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -565,6 
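
The locking decision in the af_alg_get_iv()/af_alg_put_iv() hunks above
can be summarized with a small stand-alone model. This is only a sketch:
the ex_ names are hypothetical, and the way lock_iv is derived from the
driver's CRYPTO_ALG_SERIALIZES_IV_ACCESS flag is an assumption, since
the hunk that sets ctx->lock_iv is not part of this excerpt.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for ALG_IV_SERIAL_PROCESSING and friends. */
enum ex_iv_mode {
	EX_IV_SERIAL_PROCESSING,	/* IV lives in the socket context */
	EX_IV_PARALLEL_PROCESSING,	/* IV is supplied inline with the data */
};

struct ex_ctx {
	enum ex_iv_mode iiv;
	bool lock_iv;	/* false when the driver serializes IV access */
};

/*
 * Assumption: lock_iv is simply the negation of "the driver advertises
 * that it serializes IV access itself".
 */
void ex_ctx_init(struct ex_ctx *ctx, enum ex_iv_mode iiv,
		 bool driver_serializes_iv)
{
	ctx->iiv = iiv;
	ctx->lock_iv = !driver_serializes_iv;
}

/*
 * Mirrors the condition in the patched af_alg_get_iv(): the mutex is
 * only taken for the context IV, and only if nobody else serializes.
 */
bool ex_must_take_ivlock(const struct ex_ctx *ctx)
{
	return ctx->iiv == EX_IV_SERIAL_PROCESSING && ctx->lock_iv;
}
```

In this model the AF_ALG mutex is taken in exactly one of the four
combinations: context IV in use and a driver that does not serialize.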

[PATCH v3 0/4] crypto: AF_ALG AIO improvements

2018-02-09 Thread Stephan Müller
Hi,

Herbert, patch 1 is meant for stable. However, as is it only applies to
the new AF_ALG interface implementation, although the issue goes back
to the first implementation of AIO support. Shall I try to prepare a
patch for the old AF_ALG implementation as well?

Changes from v2:

* rename ALG_IIV flags into ALG_IV_...

* rename CRYPTO_TFM_REQ_PARALLEL into CRYPTO_TFM_REQ_IV_SERIALIZE

* fix branch in patch 4 to add CRYPTO_TFM_REQ_IV_SERIALIZE flag when
  ctx->iiv == ALG_IV_SERIAL_PROCESSING

* fix patch description of patch 4

Stephan Mueller (4):
  crypto: AF_ALG AIO - lock context IV
  crypto: AF_ALG - inline IV support
  crypto: AF_ALG - allow driver to serialize IV access
  crypto: add CRYPTO_TFM_REQ_IV_SERIALIZE flag

 crypto/af_alg.c | 119 +++-
 crypto/algif_aead.c |  86 +---
 crypto/algif_skcipher.c |  38 ++
 include/crypto/if_alg.h |  37 ++
 include/linux/crypto.h  |  16 ++
 include/uapi/linux/if_alg.h |   6 ++-
 6 files changed, 249 insertions(+), 53 deletions(-)

-- 
2.14.3






[PATCH v3 1/4] crypto: AF_ALG AIO - lock context IV

2018-02-09 Thread Stephan Müller
The kernel crypto API requires the caller to set an IV in the request data
structure. That request data structure shall define one particular cipher
operation. During the cipher operation, the IV is read by the cipher
implementation and eventually the potentially updated IV (e.g. in case of
CBC) is written back to the memory location the request data structure
points to.

AF_ALG allows setting the IV with a sendmsg request, where the IV is stored
in the AF_ALG context that is unique to one particular AF_ALG socket. Note
the analogy: an AF_ALG socket is like a TFM where one recvmsg operation
uses one request with the TFM from the socket.

AF_ALG these days supports AIO operations with multiple IOCBs. I.e. with
one recvmsg call, multiple IOVECs can be specified. Each individual IOCB
(derived from one IOVEC) implies that one request data structure is
created with the data to be processed by the cipher implementation. The
IV that was set with the sendmsg call is registered with the request data
structure before the cipher operation.

In case of an AIO operation, the cipher operation invocation returns
immediately, queuing the request to the hardware. While the AIO request is
processed by the hardware, recvmsg processes the next IOVEC for which
another request is created. Again, the IV buffer from the AF_ALG socket
context is registered with the new request and the cipher operation is
invoked.

You may now see that there is a potential race condition regarding the IV
handling, because there is *no* separate IV buffer for the different
requests. This is nicely demonstrated with libkcapi using the following
command which creates an AIO request with two IOCBs each encrypting one
AES block in CBC mode:

kcapi  -d 2 -x 9  -e -c "cbc(aes)" -k
8d7dd9b0170ce0b5f2f8e1aa768e01e91da8bfc67fd486d081b28254c99eb423 -i
7fbc02ebf5b93322329df9bfccb635af -p 48981da18e4bb9ef7e2e3162d16b1910

When the first AIO request finishes before the 2nd AIO request is
processed, the returned value is:

8b19050f66582cb7f7e4b6c873819b7108afa0eaa7de29bac7d903576b674c32

I.e. two blocks where the IV output from the first request is the IV input
to the 2nd block.

In case the first AIO request is not completed before the 2nd request
commences, the result is two identical AES blocks (i.e. both use the same
IV):

8b19050f66582cb7f7e4b6c873819b718b19050f66582cb7f7e4b6c873819b71

This inconsistent result may even lead to the conclusion that there can be
a memory corruption in the IV buffer if both AIO requests write to the IV
buffer at the same time.
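
The two outcomes can be reproduced without any hardware using a toy
model. The sketch below is an illustration only: toy_encrypt_block() is
a made-up, insecure stand-in for AES, the block size is shortened to 8
bytes, and the "race" is modelled as a fixed interleaving instead of
real concurrency. It shows why two single-block CBC requests with the
same plaintext give different ciphertext blocks when serialized, and
identical blocks when both sample the shared IV before either writes it
back.

```c
#include <stdint.h>
#include <string.h>

#define BLK 8	/* toy block size; AES would use 16 */

/* Toy stand-in for a block cipher: NOT secure, merely deterministic. */
static void toy_encrypt_block(uint8_t out[BLK], const uint8_t in[BLK])
{
	int i;

	for (i = 0; i < BLK; i++)
		out[i] = (uint8_t)(in[i] * 167u + 13u + (unsigned int)i);
}

/*
 * One single-block CBC request: read the IV, encrypt, then write the
 * updated IV (the ciphertext block) back to the IV buffer.
 */
static void cbc_request(uint8_t iv[BLK], const uint8_t pt[BLK],
			uint8_t ct[BLK])
{
	uint8_t tmp[BLK];
	int i;

	for (i = 0; i < BLK; i++)
		tmp[i] = pt[i] ^ iv[i];
	toy_encrypt_block(ct, tmp);
	memcpy(iv, ct, BLK);
}

/* Request 2 starts only after request 1 has updated the shared IV. */
void run_serialized(uint8_t shared_iv[BLK], const uint8_t pt[BLK],
		    uint8_t ct1[BLK], uint8_t ct2[BLK])
{
	cbc_request(shared_iv, pt, ct1);
	cbc_request(shared_iv, pt, ct2);
}

/* Both requests read the shared IV before either one writes it back. */
void run_interleaved(uint8_t shared_iv[BLK], const uint8_t pt[BLK],
		     uint8_t ct1[BLK], uint8_t ct2[BLK])
{
	uint8_t iv1[BLK], iv2[BLK];

	memcpy(iv1, shared_iv, BLK);
	memcpy(iv2, shared_iv, BLK);
	cbc_request(iv1, pt, ct1);
	cbc_request(iv2, pt, ct2);
	memcpy(shared_iv, iv2, BLK);
}
```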

As the AF_ALG interface is used by user space, a mutex provides the
serialization which puts the caller to sleep in case a previous IOCB
processing is not yet finished.

If multiple IOCBs arrive that are all blocked, the mutex's FIFO handling
of waiters ensures that the blocked IOCBs are unblocked in the right
order.

CC:  #4.14
Fixes: e870456d8e7c8 ("crypto: algif_skcipher - overhaul memory management")
Fixes: d887c52d6ae43 ("crypto: algif_aead - overhaul memory management")
Signed-off-by: Stephan Mueller 
---
 crypto/af_alg.c | 31 +++
 crypto/algif_aead.c | 20 +---
 crypto/algif_skcipher.c | 12 +---
 include/crypto/if_alg.h |  5 +
 4 files changed, 58 insertions(+), 10 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 5231f421ad00..e7887621aa44 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1051,6 +1051,8 @@ void af_alg_async_cb(struct crypto_async_request *_req, int err)
struct kiocb *iocb = areq->iocb;
unsigned int resultlen;
 
+   af_alg_put_iv(sk);
+
/* Buffer size written by crypto operation. */
resultlen = areq->outlen;
 
@@ -1175,6 +1177,35 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
 }
 EXPORT_SYMBOL_GPL(af_alg_get_rsgl);
 
+/**
+ * af_alg_get_iv
+ *
+ * @sk [in] AF_ALG socket
+ * @return 0 on success, < 0 on error
+ */
+int af_alg_get_iv(struct sock *sk)
+{
+   struct alg_sock *ask = alg_sk(sk);
+   struct af_alg_ctx *ctx = ask->private;
+
+   return mutex_lock_interruptible(&ctx->ivlock);
+}
+EXPORT_SYMBOL_GPL(af_alg_get_iv);
+
+/**
+ * af_alg_put_iv - release lock on IV in case CTX IV is used
+ *
+ * @sk [in] AF_ALG socket
+ */
+void af_alg_put_iv(struct sock *sk)
+{
+   struct alg_sock *ask = alg_sk(sk);
+   struct af_alg_ctx *ctx = ask->private;
+
+   mutex_unlock(&ctx->ivlock);
+}
+EXPORT_SYMBOL_GPL(af_alg_put_iv);
+
 static int __init af_alg_init(void)
 {
int err = proto_register(&alg_proto, 0);
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 4b07edd5a9ff..402de50d4a58 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -159,10 +159,14 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
if (IS_ERR(areq))
return PTR_ERR(areq);
 
+   err = af_alg_get_iv(sk);
+   if (err)
+

[PATCH v3 2/4] crypto: AF_ALG - inline IV support

2018-02-09 Thread Stephan Müller
The kernel crypto API requires the caller to set an IV in the request
data structure. That request data structure shall define one particular
cipher operation. During the cipher operation, the IV is read by the
cipher implementation and eventually the potentially updated IV (e.g.
in case of CBC) is written back to the memory location the request data
structure points to.

AF_ALG allows setting the IV with a sendmsg request, where the IV is
stored in the AF_ALG context that is unique to one particular AF_ALG
socket. Note the analogy: an AF_ALG socket is like a TFM where one
recvmsg operation uses one request with the TFM from the socket.

AF_ALG these days supports AIO operations with multiple IOCBs. I.e.
with one recvmsg call, multiple IOVECs can be specified. Each
individual IOCB (derived from one IOVEC) implies that one request data
structure is created with the data to be processed by the cipher
implementation. The IV that was set with the sendmsg call is registered
with the request data structure before the cipher operation.

As of now, the individual IOCBs are serialized with respect to the IV
handling. This implies that the kernel does not perform a truly parallel
invocation of the cipher implementations. However, if the user wants to
perform cryptographic operations on multiple IOCBs where each IOCB is
truly independent from the other, parallel invocations are possible.
This would require that each IOCB provides its own IV to ensure true
separation of the IOCBs.

The solution is to allow the IV to be supplied as part of the
plaintext/ciphertext. To do so, the AF_ALG interface treats the
ALG_SET_OP value usable with sendmsg as a bit array, which allows
setting the cipher operation together with a flag indicating whether
inline IV handling should be enabled.

If inline IV handling is enabled, the IV is expected to be the first
part of the input plaintext/ciphertext. This IV is only used for one
cipher operation and will not be retained in the kernel for subsequent
cipher operations.
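
As a rough illustration of the bit-array idea, the sketch below decodes
an operation word and splits an input buffer into IV and payload. It is
not the patch's code: the EX_ constants are illustrative values chosen
for this example (the real definitions would live in the patched
include/uapi/linux/if_alg.h, which is not shown in this excerpt), and
ex_parse() and struct ex_request are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only, not the ones defined by the patch. */
#define EX_OP_DECRYPT		0u
#define EX_OP_ENCRYPT		1u
#define EX_OP_CIPHER_MASK	0xfu
#define EX_OP_INLINE_IV		0x10u

struct ex_request {
	bool encrypt;
	bool inline_iv;
	const uint8_t *iv;	/* points into the input if inline_iv is set */
	const uint8_t *data;	/* payload following the inline IV, if any */
	size_t datalen;
};

/* Returns 0 on success, -1 on an invalid operation or short input. */
int ex_parse(uint32_t op, const uint8_t *in, size_t inlen, size_t ivlen,
	     struct ex_request *req)
{
	/* The low bits carry the cipher operation ... */
	switch (op & EX_OP_CIPHER_MASK) {
	case EX_OP_ENCRYPT:
		req->encrypt = true;
		break;
	case EX_OP_DECRYPT:
		req->encrypt = false;
		break;
	default:
		return -1;
	}

	/* ... and a separate bit selects inline IV handling. */
	req->inline_iv = (op & EX_OP_INLINE_IV) != 0;
	if (req->inline_iv) {
		/* The IV must be fully present at the start of the input. */
		if (inlen < ivlen)
			return -1;
		req->iv = in;
		req->data = in + ivlen;
		req->datalen = inlen - ivlen;
	} else {
		req->iv = NULL;	/* IV comes from the socket context */
		req->data = in;
		req->datalen = inlen;
	}
	return 0;
}
```

Because the IV travels with each request, two requests parsed this way
share no IV state, which is what makes parallel processing safe.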

The inline IV handling support is only allowed to be enabled during
the first sendmsg call for a context. Any subsequent sendmsg calls are
not allowed to change the setting of the inline IV handling (either
enable or disable it) as this would open up a race condition with the
locking and unlocking of the ctx->ivlock mutex.

The AEAD support required a slight rearranging of the code, because
obtaining the IV implies that ctx->used is updated. Thus, the ctx->used
access in _aead_recvmsg must be moved below the IV gathering.

The AEAD code to find the first SG with data in the TX SGL is moved to a
common function as it is required by the IV gathering function as well.

This patch does not change the existing interface where user space is
allowed to provide an IV via sendmsg. It only extends the interface by
giving the user the choice to provide the IV either via sendmsg (the
current approach) or as part of the data (the additional approach).

Signed-off-by: Stephan Mueller 
---
 crypto/af_alg.c | 93 ++---
 crypto/algif_aead.c | 58 
 crypto/algif_skcipher.c | 12 +++---
 include/crypto/if_alg.h | 21 +-
 include/uapi/linux/if_alg.h |  6 ++-
 5 files changed, 143 insertions(+), 47 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index e7887621aa44..7f80dcfc12a6 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -14,6 +14,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -834,6 +835,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
struct af_alg_control con = {};
long copied = 0;
bool enc = 0;
+   int iiv = ALG_IV_SERIAL_PROCESSING;
bool init = 0;
int err = 0;
 
@@ -843,7 +845,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
return err;
 
init = 1;
-   switch (con.op) {
+   switch (con.op & ALG_OP_CIPHER_MASK) {
case ALG_OP_ENCRYPT:
enc = 1;
break;
@@ -854,6 +856,9 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
return -EINVAL;
}
 
+   if (con.op & ALG_OP_INLINE_IV)
+   iiv = ALG_IV_PARALLEL_PROCESSING;
+
if (con.iv && con.iv->ivlen != ivsize)
return -EINVAL;
}
@@ -866,6 +871,19 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 
if (init) {
ctx->enc = enc;
+
+   /*
+* IIV can only be enabled once with the first sendmsg call.
+* This prevents a race in locking and unlocking the
+* ctx->ivlock mutex.
+*/
+   if (ctx->iiv == 

[PATCH v3 2/3] MIPS: crypto: Add crc32 and crc32c hw accelerated module

2018-02-09 Thread James Hogan
From: Marcin Nowakowski 

This module registers crc32 and crc32c algorithms that use the
optional CRC32[bhwd] and CRC32C[bhwd] instructions in MIPSr6 cores.

Signed-off-by: Marcin Nowakowski 
Signed-off-by: James Hogan 
Cc: Ralf Baechle 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: linux-m...@linux-mips.org
Cc: linux-crypto@vger.kernel.org
---
Changes in v3:
 - Convert to using assembler macros to support CRC instructions on
   older toolchains, using the helpers merged for 4.16. This removes the
   need to hardcode either rt or rs (i.e. as $v0 (CRC_REGISTER) and
   $at), and drops the C "register" keywords sprinkled everywhere.
 - Minor whitespace rearrangement of _CRC32 macro.
 - Add SPDX-License-Identifier to crc32-mips.c and the crypto Makefile.
 - Update copyright from ImgTec to MIPS Tech, LLC.
 - Update imgtec.com email addresses to mips.com.

Changes in v2:
 - minor code refactoring as suggested by JamesH which produces
   a better assembly output for 32-bit builds
---
 arch/mips/Kconfig |   4 +-
 arch/mips/Makefile|   3 +-
 arch/mips/crypto/Makefile |   6 +-
 arch/mips/crypto/crc32-mips.c | 346 +++-
 crypto/Kconfig|   9 +-
 5 files changed, 368 insertions(+)
 create mode 100644 arch/mips/crypto/Makefile
 create mode 100644 arch/mips/crypto/crc32-mips.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index ac0f5bb10f0b..cccd17c07bfc 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -2023,6 +2023,7 @@ config CPU_MIPSR6
select CPU_HAS_RIXI
select HAVE_ARCH_BITREVERSE
select MIPS_ASID_BITS_VARIABLE
+   select MIPS_CRC_SUPPORT
select MIPS_SPRAM
 
 config EVA
@@ -2490,6 +2491,9 @@ config MIPS_ASID_BITS
 config MIPS_ASID_BITS_VARIABLE
bool
 
+config MIPS_CRC_SUPPORT
+   bool
+
 #
 # - Highmem only makes sense for the 32-bit kernel.
 # - The current highmem code will only work properly on physically indexed
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index d1ca839c3981..44a6ed53d018 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -222,6 +222,8 @@ xpa-cflags-y := $(mips-cflags)
 xpa-cflags-$(micromips-ase) += -mmicromips -Wa$(comma)-fatal-warnings
 toolchain-xpa  := $(call cc-option-yn,$(xpa-cflags-y) -mxpa)
 cflags-$(toolchain-xpa) += -DTOOLCHAIN_SUPPORTS_XPA
+toolchain-crc  := $(call cc-option-yn,$(mips-cflags) -Wa$(comma)-mcrc)
+cflags-$(toolchain-crc) += -DTOOLCHAIN_SUPPORTS_CRC
 
 #
 # Firmware support
@@ -330,6 +332,7 @@ libs-y  += arch/mips/math-emu/
 # See arch/mips/Kbuild for content of core part of the kernel
 core-y += arch/mips/
 
+drivers-$(CONFIG_MIPS_CRC_SUPPORT) += arch/mips/crypto/
 drivers-$(CONFIG_OPROFILE) += arch/mips/oprofile/
 
 # suspend and hibernation support
diff --git a/arch/mips/crypto/Makefile b/arch/mips/crypto/Makefile
new file mode 100644
index ..e07aca572c2e
--- /dev/null
+++ b/arch/mips/crypto/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for MIPS crypto files..
+#
+
+obj-$(CONFIG_CRYPTO_CRC32_MIPS) += crc32-mips.o
diff --git a/arch/mips/crypto/crc32-mips.c b/arch/mips/crypto/crc32-mips.c
new file mode 100644
index ..8d4122f37fa5
--- /dev/null
+++ b/arch/mips/crypto/crc32-mips.c
@@ -0,0 +1,346 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * crc32-mips.c - CRC32 and CRC32C using optional MIPSr6 instructions
+ *
+ * Module based on arm64/crypto/crc32-arm.c
+ *
+ * Copyright (C) 2014 Linaro Ltd 
+ * Copyright (C) 2018 MIPS Tech, LLC
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+enum crc_op_size {
+   b, h, w, d,
+};
+
+enum crc_type {
+   crc32,
+   crc32c,
+};
+
+#ifndef TOOLCHAIN_SUPPORTS_CRC
+#define _ASM_MACRO_CRC32(OP, SZ, TYPE)   \
+_ASM_MACRO_3R(OP, rt, rs, rt2,   \
+   ".ifnc  \\rt, \\rt2\n\t"  \
+   ".error \"invalid operands \\\"" #OP " \\rt,\\rs,\\rt2\\\"\"\n\t" \
+   ".endif\n\t"  \
+   _ASM_INSN_IF_MIPS(0x7c0f | (__rt << 16) | (__rs << 21) |  \
+ ((SZ) <<  6) | ((TYPE) << 8))   \
+   _ASM_INSN32_IF_MM(0x0030 | (__rs << 16) | (__rt << 21) |  \
+ ((SZ) << 14) | ((TYPE) << 3)))
+_ASM_MACRO_CRC32(crc32b,  0, 0);
+_ASM_MACRO_CRC32(crc32h,  1, 0);
+_ASM_MACRO_CRC32(crc32w,  2, 0);
+_ASM_MACRO_CRC32(crc32d,  3, 0);
+_ASM_MACRO_CRC32(crc32cb, 0, 1);
+_ASM_MACRO_CRC32(crc32ch, 1, 1);
+_ASM_MACRO_CRC32(crc32cw, 2, 

[PATCH v3 0/3] MIPS CRC instruction support

2018-02-09 Thread James Hogan
The MIPSr6 architecture introduces new CRC32(C) instructions. The following
patches add a crypto acceleration module for crc32 and crc32c algorithms
using the new instructions.
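
For readers unfamiliar with the two checksums, here is a bitwise
software reference for the arithmetic those instructions accelerate. It
is a sketch for comparison only, not code from this series: the sw_
function names are made up, and the initial value and final inversion
follow the common check-value convention, which may differ from how the
kernel's shash wrappers seed and finalize the CRC.

```c
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE	0xedb88320u	/* reflected CRC-32 (IEEE 802.3) */
#define CRC32C_POLY_LE	0x82f63b78u	/* reflected CRC-32C (Castagnoli) */

/* Process the buffer one bit at a time, least significant bit first. */
static uint32_t crc_update_bitwise(uint32_t crc, const uint8_t *p,
				   size_t len, uint32_t poly)
{
	int bit;

	while (len--) {
		crc ^= *p++;
		for (bit = 0; bit < 8; bit++)
			/* XOR in the polynomial only if the low bit is set. */
			crc = (crc >> 1) ^ (poly & (0u - (crc & 1u)));
	}
	return crc;
}

uint32_t sw_crc32(const void *buf, size_t len)
{
	return (uint32_t)~crc_update_bitwise(0xffffffffu, buf, len,
					     CRC32_POLY_LE);
}

uint32_t sw_crc32c(const void *buf, size_t len)
{
	return (uint32_t)~crc_update_bitwise(0xffffffffu, buf, len,
					     CRC32C_POLY_LE);
}
```

The only difference between the two algorithms is the polynomial. The
standard check input "123456789" gives 0xcbf43926 for CRC-32 and
0xe3069283 for CRC-32C.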

Changes in v3:
 - Convert to using assembler macros to support CRC instructions on
   older toolchains, using the helpers merged for 4.16. This removes the
   need to hardcode either rt or rs (i.e. as $v0 (CRC_REGISTER) and
   $at), and drops the C "register" keywords sprinkled everywhere.
 - Minor whitespace rearrangement of _CRC32 macro.
 - Add SPDX-License-Identifier to crc32-mips.c and the crypto Makefile.
 - Update copyright from ImgTec to MIPS Tech, LLC.
 - Update imgtec.com email addresses to mips.com.
 - New patch 3 to enable crc32-mips module on r6 configs.

Changes in v2:
 - minor code refactoring as suggested by JamesH which produces
   a better assembly output for 32-bit builds

Cc: Marcin Nowakowski 
Cc: Ralf Baechle 
Cc: Herbert Xu 
Cc: "David S. Miller" 
Cc: Paul Burton 
Cc: linux-m...@linux-mips.org
Cc: linux-crypto@vger.kernel.org

James Hogan (1):
  MIPS: generic: Enable crc32-mips on r6 configs

Marcin Nowakowski (2):
  MIPS: Add crc instruction support flag to elf_hwcap
  MIPS: crypto: Add crc32 and crc32c hw accelerated module

 arch/mips/Kconfig |   4 +-
 arch/mips/Makefile|   3 +-
 arch/mips/configs/generic/32r6.config |   2 +-
 arch/mips/configs/generic/64r6.config |   2 +-
 arch/mips/crypto/Makefile |   6 +-
 arch/mips/crypto/crc32-mips.c | 346 +++-
 arch/mips/include/asm/mipsregs.h  |   1 +-
 arch/mips/include/uapi/asm/hwcap.h|   1 +-
 arch/mips/kernel/cpu-probe.c  |   3 +-
 crypto/Kconfig|   9 +-
 10 files changed, 377 insertions(+)
 create mode 100644 arch/mips/crypto/Makefile
 create mode 100644 arch/mips/crypto/crc32-mips.c

base-commit: 791412dafbbfd860e78983d45cf71db603a82f67
-- 
git-series 0.9.1


[PATCH v3 4/4] crypto: add CRYPTO_TFM_REQ_IV_SERIALIZE flag

2018-02-09 Thread Stephan Müller
Crypto drivers may implement streamlined serialization support for AIO
requests, which they report to the crypto user with the
CRYPTO_ALG_SERIALIZES_IV_ACCESS flag. When the user decides to send
multiple AIO requests concurrently and wants the crypto driver to handle
the serialization, the caller has to set CRYPTO_TFM_REQ_IV_SERIALIZE to
notify the crypto driver.

The crypto driver shall apply its serialization logic for handling IV
updates between requests only when this flag is set. If the flag is not
provided, the driver shall not apply the serialization logic, as the
caller has decided either that it does not need it (because no parallel
AIO requests are sent) or that it performs its own serialization.
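
A driver honouring this contract might behave roughly like the model
below. This is a hypothetical sketch, not code from any real driver:
the ex_ names, the single "IV in flight" marker and the counter standing
in for a software queue are all assumptions made for illustration. Only
the flag value mirrors the define added by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_REQ_IV_SERIALIZE 0x0800u /* mirrors CRYPTO_TFM_REQ_IV_SERIALIZE */

enum ex_action {
	EX_SUBMIT_TO_HW,	/* request goes straight to the HW queue */
	EX_ENQUEUE_SW,		/* request waits until the IV is updated */
};

struct ex_driver {
	bool iv_in_flight;	/* a request owning the IV is being processed */
	unsigned int sw_queued;	/* requests parked in the software queue */
};

enum ex_action ex_driver_submit(struct ex_driver *d, uint32_t req_flags)
{
	/* Flag absent: caller needs no serialization or does its own. */
	if (!(req_flags & EX_REQ_IV_SERIALIZE))
		return EX_SUBMIT_TO_HW;

	if (d->iv_in_flight) {
		d->sw_queued++;
		return EX_ENQUEUE_SW;
	}
	d->iv_in_flight = true;
	return EX_SUBMIT_TO_HW;
}

/*
 * Completion of a serialized request, after the IV has been updated.
 * Returns true if a parked request should now be placed in the HW
 * queue; in that case the IV stays marked as in flight.
 */
bool ex_driver_complete(struct ex_driver *d)
{
	if (d->sw_queued) {
		d->sw_queued--;
		return true;
	}
	d->iv_in_flight = false;
	return false;
}
```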

Signed-off-by: Stephan Mueller 
---
 crypto/algif_aead.c | 15 ---
 crypto/algif_skcipher.c | 15 ---
 include/linux/crypto.h  |  1 +
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 619147792cc9..5ec4dec6e6a1 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -66,13 +66,22 @@ static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 {
struct sock *sk = sock->sk;
struct alg_sock *ask = alg_sk(sk);
+   struct af_alg_ctx *ctx = ask->private;
struct sock *psk = ask->parent;
struct alg_sock *pask = alg_sk(psk);
struct aead_tfm *aeadc = pask->private;
-   struct crypto_aead *tfm = aeadc->aead;
-   unsigned int ivsize = crypto_aead_ivsize(tfm);
+   struct crypto_aead *aead = aeadc->aead;
+   struct crypto_tfm *tfm = crypto_aead_tfm(aead);
+   unsigned int ivsize = crypto_aead_ivsize(aead);
+   int ret = af_alg_sendmsg(sock, msg, size, ivsize);
+
+   if (ret < 0)
+   return ret;
 
-   return af_alg_sendmsg(sock, msg, size, ivsize);
+   if (ctx->iiv == ALG_IV_SERIAL_PROCESSING)
+   tfm->crt_flags |= CRYPTO_TFM_REQ_IV_SERIALIZE;
+
+   return ret;
 }
 
 static int crypto_aead_copy_sgl(struct crypto_skcipher *null_tfm,
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index cf27dda6a181..fd2a0ba32feb 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -43,12 +43,21 @@ static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
 {
struct sock *sk = sock->sk;
struct alg_sock *ask = alg_sk(sk);
+   struct af_alg_ctx *ctx = ask->private;
struct sock *psk = ask->parent;
struct alg_sock *pask = alg_sk(psk);
-   struct crypto_skcipher *tfm = pask->private;
-   unsigned ivsize = crypto_skcipher_ivsize(tfm);
+   struct crypto_skcipher *skc = pask->private;
+   struct crypto_tfm *tfm = crypto_skcipher_tfm(skc);
+   unsigned int ivsize = crypto_skcipher_ivsize(skc);
+   int ret = af_alg_sendmsg(sock, msg, size, ivsize);
+
+   if (ret < 0)
+   return ret;
 
-   return af_alg_sendmsg(sock, msg, size, ivsize);
+   if (ctx->iiv == ALG_IV_SERIAL_PROCESSING)
+   tfm->crt_flags |= CRYPTO_TFM_REQ_IV_SERIALIZE;
+
+   return ret;
 }
 
 static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 4860aa2c9be4..4d54f2b30692 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -133,6 +133,7 @@
 #define CRYPTO_TFM_REQ_WEAK_KEY0x0100
 #define CRYPTO_TFM_REQ_MAY_SLEEP   0x0200
 #define CRYPTO_TFM_REQ_MAY_BACKLOG 0x0400
+#define CRYPTO_TFM_REQ_IV_SERIALIZE0x0800
 #define CRYPTO_TFM_RES_WEAK_KEY0x0010
 #define CRYPTO_TFM_RES_BAD_KEY_LEN 0x0020
 #define CRYPTO_TFM_RES_BAD_KEY_SCHED   0x0040
-- 
2.14.3






Re: [PATCH 0/2] sun4i_ss_prng fixes

2018-02-09 Thread Herbert Xu
On Tue, Feb 06, 2018 at 10:20:20PM +0100, Artem Savkov wrote:
> IPSec hasn't been working on my a10 board since 4.14 and it turned out to be
> caused by sun4i_ss_rng driver.
> 
> Artem Savkov (2):
>   sun4i_ss_prng: fix return value of sun4i_ss_prng_generate
>   sun4i_ss_prng: convert lock to _bh in sun4i_ss_prng_generate
> 
>  drivers/crypto/sunxi-ss/sun4i-ss-prng.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto/generic - sha3: deal with oversize stack frames

2018-02-09 Thread Herbert Xu
On Sat, Jan 27, 2018 at 09:18:32AM +, Ard Biesheuvel wrote:
> As reported by kbuild test robot, the optimized SHA3 C implementation
> compiles to mn10300 code that uses a disproportionate amount of stack
> space, i.e.,
> 
>   crypto/sha3_generic.c: In function 'keccakf':
>   crypto/sha3_generic.c:147:1: warning: the frame size of 1232 bytes is larger than 1024 bytes [-Wframe-larger-than=]
> 
> As kindly diagnosed by Arnd, this does not only occur when building for
> the mn10300 architecture (which is what the report was about) but also
> for h8300, and builds for other 32-bit architectures show an increase in
> stack space utilization as well.
> 
> Given that SHA3 operates on 64-bit quantities, and keeps a state matrix
> of 25 64-bit words, it is not surprising that 32-bit architectures with
> few general purpose registers are impacted the most by this, and it is
> therefore reasonable to implement a workaround that distinguishes between
> 32-bit and 64-bit architectures.
> 
> Arnd figured out that taking the round calculation out of the loop, and
> inlining it explicitly but only on 64-bit architectures preserves most
> of the performance gain achieved by the rewrite, and also gets rid of
> the excessive use of stack space.
> 
> Reported-by: kbuild test robot 
> Suggested-by: Arnd Bergmann 
> Signed-off-by: Ard Biesheuvel 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
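
For reference, the shape of the workaround described in the quoted
commit message can be sketched as follows. This is not the applied
patch: it is a compact, table-driven Keccak-f[1600] written for this
illustration, the ROUND_INLINE macro is a made-up name, pointer width is
used as a stand-in for the kernel's notion of a 64-bit architecture, and
the attributes assume GCC or Clang. The point is only the structure: the
round body is its own function, and it is forced inline on 64-bit builds
while being kept out of line elsewhere.

```c
#include <stdint.h>

#define KECCAK_ROUNDS 24

/* Assumption: pointer width approximates "64-bit architecture". */
#if UINTPTR_MAX > 0xffffffffu
#define ROUND_INLINE static inline __attribute__((always_inline))
#else
#define ROUND_INLINE static __attribute__((noinline))
#endif

static const uint64_t rndc[KECCAK_ROUNDS] = {
	0x0000000000000001ULL, 0x0000000000008082ULL, 0x800000000000808aULL,
	0x8000000080008000ULL, 0x000000000000808bULL, 0x0000000080000001ULL,
	0x8000000080008081ULL, 0x8000000000008009ULL, 0x000000000000008aULL,
	0x0000000000000088ULL, 0x0000000080008009ULL, 0x000000008000000aULL,
	0x000000008000808bULL, 0x800000000000008bULL, 0x8000000000008089ULL,
	0x8000000000008003ULL, 0x8000000000008002ULL, 0x8000000000000080ULL,
	0x000000000000800aULL, 0x800000008000000aULL, 0x8000000080008081ULL,
	0x8000000000008080ULL, 0x0000000080000001ULL, 0x8000000080008008ULL
};
static const int rotc[24] = {
	1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 2, 14,
	27, 41, 56, 8, 25, 43, 62, 18, 39, 61, 20, 44
};
static const int piln[24] = {
	10, 7, 11, 17, 18, 3, 5, 16, 8, 21, 24, 4,
	15, 23, 19, 13, 12, 2, 20, 14, 22, 9, 6, 1
};

static uint64_t rotl64(uint64_t x, int n)
{
	return (x << n) | (x >> (64 - n));
}

/* One round without the round constant (theta, rho, pi, chi). */
ROUND_INLINE void keccakf_round(uint64_t st[25])
{
	uint64_t bc[5], t;
	int i, j;

	/* theta: column parities mixed into every lane */
	for (i = 0; i < 5; i++)
		bc[i] = st[i] ^ st[i + 5] ^ st[i + 10] ^ st[i + 15] ^ st[i + 20];
	for (i = 0; i < 5; i++) {
		t = bc[(i + 4) % 5] ^ rotl64(bc[(i + 1) % 5], 1);
		for (j = 0; j < 25; j += 5)
			st[j + i] ^= t;
	}
	/* rho and pi: rotate each lane and move it to its new position */
	t = st[1];
	for (i = 0; i < 24; i++) {
		j = piln[i];
		bc[0] = st[j];
		st[j] = rotl64(t, rotc[i]);
		t = bc[0];
	}
	/* chi: non-linear step applied row by row */
	for (j = 0; j < 25; j += 5) {
		for (i = 0; i < 5; i++)
			bc[i] = st[j + i];
		for (i = 0; i < 5; i++)
			st[j + i] ^= ~bc[(i + 1) % 5] & bc[(i + 2) % 5];
	}
}

void keccakf(uint64_t st[25])
{
	int round;

	for (round = 0; round < KECCAK_ROUNDS; round++) {
		keccakf_round(st);
		st[0] ^= rndc[round];	/* iota */
	}
}
```

Either way the permutation computes the same result; only the code
generation, and with it the stack frame size, differs between builds.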


Re: [PATCH] crypto: talitos: fix Kernel Oops on hashing an empty file

2018-02-09 Thread Herbert Xu
On Fri, Jan 26, 2018 at 05:09:59PM +0100, Christophe Leroy wrote:
> Performing the hash of an empty file leads to a kernel Oops
> 
> [   44.504600] Unable to handle kernel paging request for data at address 
> 0x000c
> [   44.512819] Faulting instruction address: 0xc02d2be8
> [   44.524088] Oops: Kernel access of bad area, sig: 11 [#1]
> [   44.529171] BE PREEMPT CMPC885
> [   44.532232] CPU: 0 PID: 491 Comm: md5sum Not tainted 
> 4.15.0-rc8-00211-g3a968610b6ea #81
> [   44.540814] NIP:  c02d2be8 LR: c02d2984 CTR: 
> [   44.545812] REGS: c6813c90 TRAP: 0300   Not tainted  
> (4.15.0-rc8-00211-g3a968610b6ea)
> [   44.554223] MSR:  9032   CR: 48222822  XER: 2000
> [   44.560855] DAR: 000c DSISR: c000
> [   44.560855] GPR00: c02d28fc c6813d40 c6828000 c646fa40 0001 0001 
> 0001 
> [   44.560855] GPR08: 004c  c000bfcc  28222822 100280d4 
>  10020008
> [   44.560855] GPR16:  0020   10024008  
> c646f9f0 c6179a10
> [   44.560855] GPR24:  0001 c62f0018 c6179a10  c6367a30 
> c62f c646f9c0
> [   44.598542] NIP [c02d2be8] ahash_process_req+0x448/0x700
> [   44.603751] LR [c02d2984] ahash_process_req+0x1e4/0x700
> [   44.608868] Call Trace:
> [   44.611329] [c6813d40] [c02d28fc] ahash_process_req+0x15c/0x700 
> (unreliable)
> [   44.618302] [c6813d90] [c02060c4] hash_recvmsg+0x11c/0x210
> [   44.623716] [c6813db0] [c0331354] ___sys_recvmsg+0x98/0x138
> [   44.629226] [c6813eb0] [c03332c0] __sys_recvmsg+0x40/0x84
> [   44.634562] [c6813f10] [c03336c0] SyS_socketcall+0xb8/0x1d4
> [   44.640073] [c6813f40] [c000d1ac] ret_from_syscall+0x0/0x38
> [   44.645530] Instruction dump:
> [   44.648465] 38c1 7f63db78 4e800421 7c791b78 54690ffe 0f09 80ff0190 
> 2f87
> [   44.656122] 40befe50 2f990001 409e0210 813f01bc <8129000c> b39e003a 
> 7d29c214 913e003c
> 
> This patch fixes that Oops by checking if src is NULL.
> 
> Fixes: 6a1e8d14156d4 ("crypto: talitos - making mapping helpers more generic")
> Cc: 
> Signed-off-by: Christophe Leroy 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: sha512-mb - initialize pending lengths correctly

2018-02-09 Thread Herbert Xu
On Wed, Jan 24, 2018 at 12:31:27AM -0800, Eric Biggers wrote:
> From: Eric Biggers 
> 
> The SHA-512 multibuffer code keeps track of the number of blocks pending
> in each lane.  The minimum of these values is used to identify the next
> lane that will be completed.  Unused lanes are set to a large number
> (0x) so that they don't affect this calculation.
> 
> However, it was forgotten to set the lengths to this value in the
> initial state, where all lanes are unused.  As a result it was possible
> for sha512_mb_mgr_get_comp_job_avx2() to select an unused lane, causing
> a NULL pointer dereference.  Specifically this could happen in the case
> where ->update() was passed fewer than SHA512_BLOCK_SIZE bytes of data,
> so it then called sha_complete_job() without having actually submitted
> any blocks to the multi-buffer code.  This hit a NULL pointer
> dereference if another task happened to have submitted blocks
> concurrently to the same CPU and the flush timer had not yet expired.
> 
> Fix this by initializing sha512_mb_mgr->lens correctly.
> 
> As usual, this bug was found by syzkaller.
> 
> Fixes: 45691e2d9b18 ("crypto: sha512-mb - submit/flush routines for AVX2")
> Reported-by: syzbot 
> Cc:  # v4.8+
> Signed-off-by: Eric Biggers 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
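
The failure mode described in the quoted commit message is easy to model
in isolation. The sketch below is a simplified illustration, not the
multibuffer manager itself: the ex_ names are hypothetical, four lanes
are assumed, and UINT32_MAX is used as the "unused" sentinel because the
actual value is elided in the text above. It shows how a zeroed lens
array makes the minimum search pick a lane that has no job attached.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NUM_LANES	4		/* assumed lane count */
#define EX_LANE_UNUSED	UINT32_MAX	/* assumed sentinel for idle lanes */

struct ex_mgr {
	uint32_t lens[EX_NUM_LANES];	/* blocks pending in each lane */
	const void *job[EX_NUM_LANES];	/* NULL while the lane is unused */
};

/* apply_fix == 0 models the old initial state, where lens stayed 0. */
void ex_mgr_init(struct ex_mgr *m, int apply_fix)
{
	int i;

	for (i = 0; i < EX_NUM_LANES; i++) {
		m->job[i] = NULL;
		m->lens[i] = apply_fix ? EX_LANE_UNUSED : 0;
	}
}

void ex_mgr_submit(struct ex_mgr *m, int lane, const void *job,
		   uint32_t blocks)
{
	m->job[lane] = job;
	m->lens[lane] = blocks;
}

/*
 * Returns the lane with the fewest pending blocks, i.e. the one that
 * completes next, or -1 if every lane carries the sentinel.
 */
int ex_mgr_next_lane(const struct ex_mgr *m)
{
	int i, best = 0;

	for (i = 1; i < EX_NUM_LANES; i++)
		if (m->lens[i] < m->lens[best])
			best = i;
	return m->lens[best] == EX_LANE_UNUSED ? -1 : best;
}
```

With the old initial state and a single job in lane 2, the search
returns lane 0, whose job pointer is NULL. With the sentinel in place it
returns lane 2.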