Re: [PATCH v3 0/7] crypto: aes - allow generic AES to be omitted

2017-07-18 Thread Herbert Xu
On Tue, Jul 18, 2017 at 08:57:28AM +0100, Ard Biesheuvel wrote:
>
> So if you care about security and/or the cache/memory footprint more
> than about speed, you can disable the table based implementations that
> exist for i586, x86, ARM and arm64 (all of which have faster and time
> invariant implementations based on SIMD or special instructions
> anyway, so for 95% of the cases, it does not really matter).

The thing is that anybody who cares about speed won't be using
aes-generic anyway.  We have way too many AES implementations
as it is, and having two C implementations is really getting
silly.

So would it be possible for you to proceed with your work in
such a way that we end up with just aes-ti as the generic C
implementation?

As for the table-based asm implementations yes they can stay and
work out some way of sharing that table at the source-code level.
At run-time the table can just go into the asm module directly
since you'd only have one on each platform, right?

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Greg Kroah-Hartman
On Tue, Jul 18, 2017 at 09:57:55AM +0200, Stephan Müller wrote:
> --- /dev/null
> +++ b/include/crypto/jitterentropy.h
> @@ -0,0 +1,80 @@
> +/*
> + * Copyright (C) 2017, Stephan Mueller 
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + * 1. Redistributions of source code must retain the above copyright
> + *notice, and the entire permission notice in its entirety,
> + *including the disclaimer of warranties.
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in the
> + *documentation and/or other materials provided with the distribution.
> + * 3. The name of the author may not be used to endorse or promote
> + *products derived from this software without specific prior
> + *written permission.
> + *
> + * ALTERNATIVELY, this product may be distributed under the terms of
> + * the GNU General Public License, in which case the provisions of the GPL2 
> are
> + * required INSTEAD OF the above restrictions.  (This clause is
> + * necessary due to a potential bad interaction between the GPL and
> + * the restrictions contained in a BSD-style copyright.)
> + *
> + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
> + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
> + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ALL OF
> + * WHICH ARE HEREBY DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE
> + * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> + * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
> + * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
> + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
> + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
> + * USE OF THIS SOFTWARE, EVEN IF NOT ADVISED OF THE POSSIBILITY OF SUCH
> + * DAMAGE.
> + */
> +
> +#ifndef _JITTERENTROPY_H
> +#define _JITTERENTROPY_H
> +
> +typedef  unsigned long long  __u64;
> +typedef  long long   __s64;

types.h already has these defines, don't re-typedef them again...


Re: [RFC PATCH v12 2/4] random: conditionally compile code depending on LRNG

2017-07-18 Thread Arnd Bergmann
On Tue, Jul 18, 2017 at 9:58 AM, Stephan Müller  wrote:
> When selecting the LRNG for compilation, disable add_disk_randomness and
> its supporting function.
>
> CC: Greg Kroah-Hartman 
> CC: Arnd Bergmann 
> CC: Jason A. Donenfeld 
> Signed-off-by: Stephan Mueller 

I think this needs a better explanation. Why do we ignore the extra
entropy here?

   Arnd


[RFC PATCH v12 4/4] LRNG - enable compile

2017-07-18 Thread Stephan Müller
Add LRNG compilation support.

CC: Greg Kroah-Hartman 
CC: Arnd Bergmann 
CC: Jason A. Donenfeld 
Signed-off-by: Stephan Mueller 
---
 drivers/char/Kconfig  | 10 ++
 drivers/char/Makefile | 10 +-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index ccd239a..88cc472 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -587,5 +587,15 @@ config TILE_SROM
 
 source "drivers/char/xillybus/Kconfig"
 
+config LRNG
+   bool "Linux Random Number Generator"
+   select CRYPTO_DRBG_MENU
+   select CRYPTO_CMAC if CRYPTO_DRBG_CTR
+   help
+ The Linux Random Number Generator (LRNG) is the replacement
+ of the legacy /dev/random provided with drivers/char/random.c.
+ It generates entropy from different noise sources and
+ delivers significant entropy during boot.
+
 endmenu
 
diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index 53e3372..87e06ec 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -2,7 +2,15 @@
 # Makefile for the kernel character device drivers.
 #
 
-obj-y  += mem.o random.o
+obj-y  += mem.o
+
+ifeq ($(CONFIG_LRNG),y)
+  obj-$(CONFIG_LRNG)   += lrng.o
+  lrng-y   += lrng_base.o lrng_chacha20.o
+else
+  obj-y+= random.o
+endif
+
 obj-$(CONFIG_TTY_PRINTK)   += ttyprintk.o
 obj-y  += misc.o
 obj-$(CONFIG_ATARI_DSP56K) += dsp56k.o
-- 
2.9.4




[RFC PATCH v12 2/4] random: conditionally compile code depending on LRNG

2017-07-18 Thread Stephan Müller
When selecting the LRNG for compilation, disable add_disk_randomness and
its supporting function.

CC: Greg Kroah-Hartman 
CC: Arnd Bergmann 
CC: Jason A. Donenfeld 
Signed-off-by: Stephan Mueller 
---
 include/linux/genhd.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index e619fae..7e08ebc 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -422,8 +422,13 @@ extern void disk_flush_events(struct gendisk *disk, 
unsigned int mask);
 extern unsigned int disk_clear_events(struct gendisk *disk, unsigned int mask);
 
 /* drivers/char/random.c */
+#ifdef CONFIG_LRNG
+#define add_disk_randomness(disk) do {} while (0)
+#define rand_initialize_disk(disk) do {} while (0)
+#else
 extern void add_disk_randomness(struct gendisk *disk) __latent_entropy;
 extern void rand_initialize_disk(struct gendisk *disk);
+#endif
 
 static inline sector_t get_start_sect(struct block_device *bdev)
 {
-- 
2.9.4




[RFC PATCH v12 0/4] /dev/random - a new approach

2017-07-18 Thread Stephan Müller
Hi,

after distributing patches for the Linux-RNG - a new approach for the last
half year, I would like to submit this RFC for inclusion into the kernel.

The following patch set provides a different approach to /dev/random which
I call Linux Random Number Generator (LRNG) to collect entropy within the Linux
kernel. The main improvements compared to the legacy /dev/random is to provide
sufficient entropy during boot time as well as in virtual environments and when
using SSDs. A secondary design goal is to limit the impact of the entropy
collection on massive parallel systems and also allow the use accelerated
cryptographic primitives. Also, all steps of the entropic data processing are
testable.

The design and implementation is driven by a set of goals described in [1]
that the LRNG completely implements. Furthermore, [1] includes a
comparison with RNG design suggestions such as SP800-90B, SP800-90C, and
AIS20/31.

Booting the patch with the kernel command line option
"dyndbg=file drivers/char/lrng* +p" generates logs indicating the operation
of the LRNG. Each log is pre-pended with "lrng:".

The LRNG has a flexible design by allowing an easy replacement of the
deterministic random number generator component. For the request of inclusion
into the kernel, the scope of the LRNG is limited compared to the previous
patch sets. The following components are not part of the patch set and are
intended for a separate submission:

- FIPS 140-2 continuous random number generator test -- the LRNG is therefore
  not compliant with FIPS 140-2 requirements.

- Support for using the SP800-90A DRBG.

[1] http://www.chronox.de/lrng.html

Changes (compared to v11 of the previous patch set):

* use PTR_ERR_OR_ZERO as suggested by Julia Lawall

* fixed coccocinelle warnings

* fixed cppcheck style hints

* streamlined add_interrupt_randomness

* all DRNG-specific code is re-allocated to the C files specific to the
  respective DRNG

* rename all macros from DRBG -> DRNG

* rename all functions from *drbg* -> *drng*

* functions grouped into pdrng and sdrng processing for easier reading

* Use Jitter RNG to seed even the init RNG for entropy at earliest boot time
  which implies that the very first random number generated by the LRNG is
  seeded with the Jitter RNG:
[0.029839] lrng: Jitter RNG working on current system
[0.03] lrng: obtained 32 bits of entropy from Jitter RNG noise source

* incorporate wait_for_random_bytes from Jason A. Donenfeld

* incorporate invalidate_batched_entropy from Jason A. Donenfeld

* incorporate debug logs for unseeded DRNGs from Jason A. Donenfeld including
  rate limiting from Ted Ts'o

* rename lrng_standalone.c -> lrng_chacha20.c

* bug fix edge condition during reseed on NUMA systems

* enable stuck test during early boot

* When waiting for "good" random numbers, the following concept applies:
- kernel space: reaching the minimally seeded level triggers wakeup
- user space: reaching the fully seeded level triggers wakeup

* Use RDSEED for seeding operations and RDRAND as a fallback as suggested by
  DJ Johnston (note, the final fallback to use a high-resolution timer is
  implicitly present by using the time stamp unconditional for each reseed).

* conserve entropy in output function of primary DRNG

Stephan Mueller (4):
  crypto: make Jitter RNG directly accessible
  random: conditionally compile code depending on LRNG
  Linux Random Number Generator
  LRNG - enable compile

 crypto/jitterentropy.c |   33 +-
 drivers/char/Kconfig   |   10 +
 drivers/char/Makefile  |   10 +-
 drivers/char/lrng_base.c   | 2297 
 drivers/char/lrng_chacha20.c   |  291 +
 include/crypto/jitterentropy.h |   80 ++
 include/linux/genhd.h  |5 +
 7 files changed, 2694 insertions(+), 32 deletions(-)
 create mode 100644 drivers/char/lrng_base.c
 create mode 100644 drivers/char/lrng_chacha20.c
 create mode 100644 include/crypto/jitterentropy.h

-- 
2.9.4




[RFC PATCH v12 3/4] Linux Random Number Generator

2017-07-18 Thread Stephan Müller
The LRNG with the following properties:

* noise source: interrupts timing with fast boot time seeding

* lockless LFSR to collect raw entropy

* use of standalone ChaCha20 based RNG with the option to use a
  different DRNG selectable at compile time

* "atomic" seeding of secondary DRBG to ensure full entropy
  transport

* instantiate one DRNG per NUMA node

Further details including the rationale for the design choices and
properties of the LRNG together with testing is provided at [1].
In addition, the documentation explains the conducted regression
tests to verify that the LRNG is API and ABI compatible with the
legacy /dev/random implementation.

[1] http://www.chronox.de/lrng.html

CC: Greg Kroah-Hartman 
CC: Arnd Bergmann 
CC: Jason A. Donenfeld 
Signed-off-by: Stephan Mueller 
---
 drivers/char/lrng_base.c | 2297 ++
 drivers/char/lrng_chacha20.c |  291 ++
 2 files changed, 2588 insertions(+)
 create mode 100644 drivers/char/lrng_base.c
 create mode 100644 drivers/char/lrng_chacha20.c

diff --git a/drivers/char/lrng_base.c b/drivers/char/lrng_base.c
new file mode 100644
index 000..a87c3ef
--- /dev/null
+++ b/drivers/char/lrng_base.c
@@ -0,0 +1,2297 @@
+/*
+ * Linux Random Number Generator (LRNG)
+ *
+ * Documentation and test code: http://www.chronox.de/lrng.html
+ *
+ * Copyright (C) 2016 - 2017, Stephan Mueller 
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, and the entire permission notice in its entirety,
+ *including the disclaimer of warranties.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. The name of the author may not be used to endorse or promote
+ *products derived from this software without specific prior
+ *written permission.
+ *
+ * ALTERNATIVELY, this product may be distributed under the terms of
+ * the GNU General Public License, in which case the provisions of the GPL2
+ * are required INSTEAD OF the above restrictions.  (This clause is
+ * necessary due to a potential bad interaction between the GPL and
+ * the restrictions contained in a BSD-style copyright.)
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ALL OF
+ * WHICH ARE HEREBY DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
+ * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF NOT ADVISED OF THE POSSIBILITY OF SUCH
+ * DAMAGE.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Security strength of LRNG -- this must match DRNG security strength */
+#define LRNG_DRNG_SECURITY_STRENGTH_BYTES 32
+#define LRNG_DRNG_SECURITY_STRENGTH_BITS (LRNG_DRNG_SECURITY_STRENGTH_BYTES * 
8)
+
+#define LRNG_DRNG_BLOCKSIZE 64 /* Maximum of DRNG block sizes */
+
+/*
+ * Alignmask which should cover all cipher implementations
+ * WARNING: If this is changed to a value larger than 8, manual
+ * alignment is necessary as older versions of GCC may not be capable
+ * of aligning stack variables at boundaries greater than 8.
+ * In this case, PTR_ALIGN must be used.
+ */
+#define LRNG_KCAPI_ALIGN 8
+
+/* Primary DRNG state handle */
+struct lrng_pdrng {
+   void *pdrng;/* DRNG handle */
+   bool pdrng_fully_seeded;/* Is DRNG fully seeded? */
+   bool pdrng_min_seeded;  /* Is DRNG minimally seeded? */
+   u32 pdrng_entropy_bits; /* DRNG entropy level */
+   struct work_struct lrng_seed_work;  /* (re)seed work queue */
+   spinlock_t lock;
+};
+
+/* Secondary DRNG state handle */
+struct lrng_sdrng {
+   void *sdrng;/* DRNG handle */
+   atomic_t requests;  /* Number of DRNG requests */
+   unsigned long last_seeded;  /* Last time it was seeded */
+   bool fully_seeded;  /* Is DRNG fully seeded? */
+   

[RFC PATCH v12 1/4] crypto: make Jitter RNG directly accessible

2017-07-18 Thread Stephan Müller
To support the LRNG operation which allocates the Jitter RNG separately
from the kernel crypto API, extract the relevant information into a
separate header file.

CC: Greg Kroah-Hartman 
CC: Arnd Bergmann 
CC: Jason A. Donenfeld 
Signed-off-by: Stephan Mueller 
---
 crypto/jitterentropy.c | 33 ++---
 include/crypto/jitterentropy.h | 80 ++
 2 files changed, 82 insertions(+), 31 deletions(-)
 create mode 100644 include/crypto/jitterentropy.h

diff --git a/crypto/jitterentropy.c b/crypto/jitterentropy.c
index acf44b2..90184a1 100644
--- a/crypto/jitterentropy.c
+++ b/crypto/jitterentropy.c
@@ -54,40 +54,11 @@
  #error "The CPU Jitter random number generator must not be compiled with 
optimizations. See documentation. Use the compiler switch -O0 for compiling 
jitterentropy.c."
 #endif
 
-typedefunsigned long long  __u64;
-typedeflong long   __s64;
+#include 
+
 typedefunsigned int__u32;
 #define NULL((void *) 0)
 
-/* The entropy pool */
-struct rand_data {
-   /* all data values that are vital to maintain the security
-* of the RNG are marked as SENSITIVE. A user must not
-* access that information while the RNG executes its loops to
-* calculate the next random value. */
-   __u64 data; /* SENSITIVE Actual random number */
-   __u64 old_data; /* SENSITIVE Previous random number */
-   __u64 prev_time;/* SENSITIVE Previous time stamp */
-#define DATA_SIZE_BITS ((sizeof(__u64)) * 8)
-   __u64 last_delta;   /* SENSITIVE stuck test */
-   __s64 last_delta2;  /* SENSITIVE stuck test */
-   unsigned int stuck:1;   /* Time measurement stuck */
-   unsigned int osr;   /* Oversample rate */
-   unsigned int stir:1;/* Post-processing stirring */
-   unsigned int disable_unbias:1;  /* Deactivate Von-Neuman unbias */
-#define JENT_MEMORY_BLOCKS 64
-#define JENT_MEMORY_BLOCKSIZE 32
-#define JENT_MEMORY_ACCESSLOOPS 128
-#define JENT_MEMORY_SIZE (JENT_MEMORY_BLOCKS*JENT_MEMORY_BLOCKSIZE)
-   unsigned char *mem; /* Memory access location with size of
-* memblocks * memblocksize */
-   unsigned int memlocation; /* Pointer to byte in *mem */
-   unsigned int memblocks; /* Number of memory blocks in *mem */
-   unsigned int memblocksize; /* Size of one memory block in bytes */
-   unsigned int memaccessloops; /* Number of memory accesses per random
- * bit generation */
-};
-
 /* Flags that can be used to initialize the RNG */
 #define JENT_DISABLE_STIR (1<<0) /* Disable stirring the entropy pool */
 #define JENT_DISABLE_UNBIAS (1<<1) /* Disable the Von-Neuman Unbiaser */
diff --git a/include/crypto/jitterentropy.h b/include/crypto/jitterentropy.h
new file mode 100644
index 000..7ed8f20
--- /dev/null
+++ b/include/crypto/jitterentropy.h
@@ -0,0 +1,80 @@
+/*
+ * Copyright (C) 2017, Stephan Mueller 
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, and the entire permission notice in its entirety,
+ *including the disclaimer of warranties.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. The name of the author may not be used to endorse or promote
+ *products derived from this software without specific prior
+ *written permission.
+ *
+ * ALTERNATIVELY, this product may be distributed under the terms of
+ * the GNU General Public License, in which case the provisions of the GPL2 are
+ * required INSTEAD OF the above restrictions.  (This clause is
+ * necessary due to a potential bad interaction between the GPL and
+ * the restrictions contained in a BSD-style copyright.)
+ *
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+ * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ALL OF
+ * WHICH ARE HEREBY DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
+ * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+ * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
+ * USE OF THIS SOFTWARE, EVEN IF NOT ADVISED OF THE POSSIBILITY OF SUCH
+ * DAMAGE.

Re: [PATCH v3 0/7] crypto: aes - allow generic AES to be omitted

2017-07-18 Thread Ard Biesheuvel
On 18 July 2017 at 08:18, Herbert Xu  wrote:
> On Tue, Jul 18, 2017 at 07:32:41AM +0100, Ard Biesheuvel wrote:
>>
>> Because it is slower, and how much slower is architecture dependent
>> (if your arch has slow multiplication, aes-ti decryption will be dog
>> slow compared to aes-generic)
>
> Right, but does anybody actually care? My guess is that on such a
> platform aes-generic is going to be dog-slow anyway.
>

Could be, yes, but it is not trivial to find out.

>> Also, quite a few architectures have table based implementations that
>> reuse crypto_ft_tab/crypto_fl_tab etc so we'd need to factor out those
>> into a separate module if we were to remove aes-generic.
>
> You mean x86 and arm, right? Isn't the idea of your patch-set to allow
> aes-generic to be disabled on x86/arm, no?
>

Yes. The original aes-ti implemented encryption only, to provide
AES-CBCMAC, CMAC etc, i.e., algorithms that rely on encryption only,
and cannot be parallelized, the use case being 32-bit ARM, which has a
fast, NEON-based, time invariant implementation for parallel modes,
but had to fall back to the table based AES for the MAC part of CCM
etc. As suggested by Eric, AES-TI was converted into a full cipher,
allowing it to replace AES generic entirely. (I have documented the
rationale more elaborately here:
http://www.workofard.com/2017/02/time-invariant-aes/)

So if you care about security and/or the cache/memory footprint more
than about speed, you can disable the table based implementations that
exist for i586, x86, ARM and arm64 (all of which have faster and time
invariant implementations based on SIMD or special instructions
anyway, so for 95% of the cases, it does not really matter).

But I'd be happy to rework the code so that the small time invariant
routines implement the core AES code, fulfilling the existing
dependencies on CRYPTO_AES. Then, we could still provide AES generic
on top of that, or only expose the tables (I don't think we should
remove the x86/arm table based drivers altogether). This fits well
with my non-SIMD fallback changes for arm64, which invoke
crypto_aes_encrypt() directly (rather than going via
CRYPTO_ALG_NEED_FALLBACK resolution, which itself may produce another
SIMD based algo which requires a non-SIMD fallback itself)


Re: [PATCH v3 0/7] crypto: aes - allow generic AES to be omitted

2017-07-18 Thread Herbert Xu
On Tue, Jul 18, 2017 at 07:32:41AM +0100, Ard Biesheuvel wrote:
>
> Because it is slower, and how much slower is architecture dependent
> (if your arch has slow multiplication, aes-ti decryption will be dog
> slow compared to aes-generic)

Right, but does anybody actually care? My guess is that on such a
platform aes-generic is going to be dog-slow anyway.

> Also, quite a few architectures have table based implementations that
> reuse crypto_ft_tab/crypto_fl_tab etc so we'd need to factor out those
> into a separate module if we were to remove aes-generic.

You mean x86 and arm, right? Isn't the idea of your patch-set to allow
aes-generic to be disabled on x86/arm, no?

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH V2 6/6] crypto/nx: Add P9 NX support for 842 compression engine

2017-07-18 Thread Ram Pai
On Mon, Jul 17, 2017 at 04:50:38PM -0700, Haren Myneni wrote:
> 
> This patch adds P9 NX support for 842 compression engine. Virtual
> Accelerator Switchboard (VAS) is used to access 842 engine on P9.
> 
> For each NX engine per chip, setup receive window using
> vas_rx_win_open() which configures RxFIFo with FIFO address, lpid,
> pid and tid values. This unique (lpid, pid, tid) combination will
> be used to identify the target engine.
> 
> For crypto open request, open send window on the NX engine for
> the corresponding chip / cpu where the open request is executed.
> This send window will be closed upon crypto close request.
> 
> NX provides high and normal priority FIFOs. For compression /
> decompression requests, we use only hight priority FIFOs in kernel.
> 
> Each NX request will be communicated to VAS using copy/paste
> instructions with vas_copy_crb() / vas_paste_crb() functions.
> 
> Signed-off-by: Haren Myneni 
> ---
>  drivers/crypto/nx/Kconfig  |   1 +
>  drivers/crypto/nx/nx-842-powernv.c | 369 
> -
>  drivers/crypto/nx/nx-842.c |   2 +-
>  3 files changed, 365 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/crypto/nx/Kconfig b/drivers/crypto/nx/Kconfig
> index ad7552a6998c..cd5dda9c48f4 100644
> --- a/drivers/crypto/nx/Kconfig
> +++ b/drivers/crypto/nx/Kconfig
> @@ -38,6 +38,7 @@ config CRYPTO_DEV_NX_COMPRESS_PSERIES
>  config CRYPTO_DEV_NX_COMPRESS_POWERNV
>   tristate "Compression acceleration support on PowerNV platform"
>   depends on PPC_POWERNV
> + depends on PPC_VAS
>   default y
>   help
> Support for PowerPC Nest (NX) compression acceleration. This
> diff --git a/drivers/crypto/nx/nx-842-powernv.c 
> b/drivers/crypto/nx/nx-842-powernv.c
> index c0dd4c7e17d3..8d9d21420144 100644
> --- a/drivers/crypto/nx/nx-842-powernv.c
> +++ b/drivers/crypto/nx/nx-842-powernv.c
> @@ -23,6 +23,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Dan Streetman ");
> @@ -32,6 +33,9 @@ MODULE_ALIAS_CRYPTO("842-nx");
> 
>  #define WORKMEM_ALIGN(CRB_ALIGN)
>  #define CSB_WAIT_MAX (5000) /* ms */
> +#define VAS_RETRIES  (10)
> +/* # of requests allowed per RxFIFO at a time. 0 for unlimited */
> +#define MAX_CREDITS_PER_RXFIFO   (64)
> 
>  struct nx842_workmem {
>   /* Below fields must be properly aligned */
> @@ -42,16 +46,27 @@ struct nx842_workmem {
> 
>   ktime_t start;
> 
> + struct vas_window *txwin;   /* Used with VAS function */
>   char padding[WORKMEM_ALIGN]; /* unused, to allow alignment */
>  } __packed __aligned(WORKMEM_ALIGN);
> 
>  struct nx842_coproc {
>   unsigned int chip_id;
>   unsigned int ct;
> - unsigned int ci;
> + unsigned int ci;/* Coprocessor instance, used with icswx */
> + struct {
> + struct vas_window *rxwin;
> + int id;
> + } vas;

ci and vas are mutually exclusive. a few bytes could be saved by unionizing 
them?

>   struct list_head list;
>  };
> 
> +/*
> + * Send the request to NX engine on the chip for the corresponding CPU
> + * where the process is executing. Use with VAS function.
> + */
> +static DEFINE_PER_CPU(struct nx842_coproc *, coproc_inst);
> +
>  /* no cpu hotplug on powernv, so this list never changes after init */
>  static LIST_HEAD(nx842_coprocs);
>  static unsigned int nx842_ct;/* used in icswx function */
> @@ -513,6 +528,108 @@ static int nx842_exec_icswx(const unsigned char *in, 
> unsigned int inlen,
>  }
> 
>  /**
> + * nx842_exec_vas - compress/decompress data using the 842 algorithm
> + *
> + * (De)compression provided by the NX842 coprocessor on IBM PowerNV systems.
> + * This compresses or decompresses the provided input buffer into the 
> provided
> + * output buffer.
> + *
> + * Upon return from this function @outlen contains the length of the
> + * output data.  If there is an error then @outlen will be 0 and an
> + * error will be specified by the return code from this function.
> + *
> + * The @workmem buffer should only be used by one function call at a time.
> + *
> + * @in: input buffer pointer
> + * @inlen: input buffer size
> + * @out: output buffer pointer
> + * @outlenp: output buffer size pointer
> + * @workmem: working memory buffer pointer, size determined by
> + *   nx842_powernv_driver.workmem_size
> + * @fc: function code, see CCW Function Codes in nx-842.h
> + *
> + * Returns:
> + *   0   Success, output of length @outlenp stored in the buffer
> + *   at @out
> + *   -ENODEV Hardware unavailable
> + *   -ENOSPC Output buffer is to small
> + *   -EMSGSIZE   Input buffer too large
> + *   -EINVAL buffer constraints do not fix nx842_constraints
> + *   -EPROTO hardware error during operation
> + *   -ETIMEDOUT  hardware did not complete operation in reasonable time
> + *   -EINTR  operation was aborted
> + */
> +static 

Re: [PATCH v3 0/7] crypto: aes - allow generic AES to be omitted

2017-07-18 Thread Ard Biesheuvel
On 18 July 2017 at 06:25, Herbert Xu  wrote:
> On Tue, Jun 20, 2017 at 11:28:53AM +0200, Ard Biesheuvel wrote:
>> The generic AES driver uses 16 lookup tables of 1 KB each, and has
>> encryption and decryption routines that are fully unrolled. Given how
>> the dependencies between this code and other drivers are declared in
>> Kconfig files, this code is always pulled into the core kernel, even
>> if it is usually superseded at runtime by accelerated drivers that
>> exist for many architectures.
>
> Why can't we simply replace aes-generic with aes-ti?
>

Because it is slower, and how much slower is architecture dependent
(if your arch has slow multiplication, aes-ti decryption will be dog
slow compared to aes-generic)

Also, quite a few architectures have table based implementations that
reuse crypto_ft_tab/crypto_fl_tab etc so we'd need to factor out those
into a separate module if we were to remove aes-generic.


Re: [PATCH] crypto: ccp - Fix XTS-AES support on a version 5 CCP

2017-07-18 Thread Stephan Müller
Am Montag, 17. Juli 2017, 22:08:27 CEST schrieb Gary R Hook:

Hi Gary,

> Version 5 CCPs have differing requirements for XTS-AES: key components
> are stored in a 512-bit vector. The context must be little-endian
> justified. AES-256 is supported now, so propagate the cipher size to
> the command descriptor.
> 
> Signed-off-by: Gary R Hook 
> ---
>  drivers/crypto/ccp/ccp-crypto-aes-xts.c |   79
> --- drivers/crypto/ccp/ccp-dev-v5.c |  
>  2 +
>  drivers/crypto/ccp/ccp-dev.h|2 +
>  drivers/crypto/ccp/ccp-ops.c|   56 ++
>  4 files changed, 89 insertions(+), 50 deletions(-)
> 
> diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
> b/drivers/crypto/ccp/ccp-crypto-aes-xts.c index 58a4244b4752..8d248b198e22
> 100644
> --- a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
> +++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
> @@ -1,7 +1,7 @@
>  /*
>   * AMD Cryptographic Coprocessor (CCP) AES XTS crypto API support
>   *
> - * Copyright (C) 2013 Advanced Micro Devices, Inc.
> + * Copyright (C) 2013,2017 Advanced Micro Devices, Inc.
>   *
>   * Author: Tom Lendacky 
>   *
> @@ -37,46 +37,26 @@ struct ccp_unit_size_map {
>   u32 value;
>  };
> 
> -static struct ccp_unit_size_map unit_size_map[] = {
> +static struct ccp_unit_size_map xts_unit_sizes[] = {
>   {
> - .size   = 4096,
> - .value  = CCP_XTS_AES_UNIT_SIZE_4096,
> - },
> - {
> - .size   = 2048,
> - .value  = CCP_XTS_AES_UNIT_SIZE_2048,
> - },
> - {
> - .size   = 1024,
> - .value  = CCP_XTS_AES_UNIT_SIZE_1024,
> + .size   = 16,
> + .value  = CCP_XTS_AES_UNIT_SIZE_16,
>   },
>   {
> - .size   = 512,
> + .size   = 512,
>   .value  = CCP_XTS_AES_UNIT_SIZE_512,
>   },
>   {
> - .size   = 256,
> - .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
> - },
> - {
> - .size   = 128,
> - .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
> - },
> - {
> - .size   = 64,
> - .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
> - },
> - {
> - .size   = 32,
> - .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
> + .size   = 1024,
> + .value  = CCP_XTS_AES_UNIT_SIZE_1024,
>   },
>   {
> - .size   = 16,
> - .value  = CCP_XTS_AES_UNIT_SIZE_16,
> + .size   = 2048,
> + .value  = CCP_XTS_AES_UNIT_SIZE_2048,
>   },
>   {
> - .size   = 1,
> - .value  = CCP_XTS_AES_UNIT_SIZE__LAST,
> + .size   = 4096,
> + .value  = CCP_XTS_AES_UNIT_SIZE_4096,
>   },
>  };
> 
> @@ -97,14 +77,20 @@ static int ccp_aes_xts_setkey(struct crypto_ablkcipher
> *tfm, const u8 *key, unsigned int key_len)
>  {
>   struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ablkcipher_tfm(tfm));
> + unsigned int ccpversion = ccp_version();
> 
>   /* Only support 128-bit AES key with a 128-bit Tweak key,
>* otherwise use the fallback
>*/
> +

Can you please add xts_check_key here?

>   switch (key_len) {
>   case AES_KEYSIZE_128 * 2:
>   memcpy(ctx->u.aes.key, key, key_len);
>   break;
> + case AES_KEYSIZE_256 * 2:
> + if (ccpversion > CCP_VERSION(3, 0))
> + memcpy(ctx->u.aes.key, key, key_len);
> + break;
>   }
>   ctx->u.aes.key_len = key_len / 2;
>   sg_init_one(>u.aes.key_sg, ctx->u.aes.key, key_len);
> @@ -117,7 +103,10 @@ static int ccp_aes_xts_crypt(struct ablkcipher_request
> *req, {
>   struct ccp_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
>   struct ccp_aes_req_ctx *rctx = ablkcipher_request_ctx(req);
> + unsigned int ccpversion = ccp_version();
> + unsigned int fallback = 0;
>   unsigned int unit;
> + u32 block_size = 0;
>   u32 unit_size;
>   int ret;
> 
> @@ -131,17 +120,29 @@ static int ccp_aes_xts_crypt(struct ablkcipher_request
> *req, return -EINVAL;
> 
>   unit_size = CCP_XTS_AES_UNIT_SIZE__LAST;
> - if (req->nbytes <= unit_size_map[0].size) {
> - for (unit = 0; unit < ARRAY_SIZE(unit_size_map); unit++) {
> - if (!(req->nbytes & (unit_size_map[unit].size - 1))) {
> - unit_size = unit_size_map[unit].value;
> - break;
> - }
> + for (unit = ARRAY_SIZE(xts_unit_sizes); unit-- > 0; ) {
> + if (!(req->nbytes & (xts_unit_sizes[unit].size - 1))) {
> + unit_size = unit;
> + block_size = xts_unit_sizes[unit].size;
> + break;
>   }
>   }
> 
> - if ((unit_size == CCP_XTS_AES_UNIT_SIZE__LAST) ||
> - (ctx->u.aes.key_len != AES_KEYSIZE_128)) {
> + /* The CCP has 

Re: [PATCH V2 0/6] Enable NX 842 compression engine on Power9

2017-07-18 Thread Nicholas Piggin
On Mon, 17 Jul 2017 16:43:19 -0700
Haren Myneni  wrote:

> [PATCH V2 0/6] Enable NX 842 compression engine on Power9
> 
> P9 introduces Virtual Accelerator Switchboard (VAS) to communicate
> with NX 842 engine. icswx function is used to access NX before.
> On powerNV systems, NX-842 driver invokes VAS functions for
> configuring RxFIFO (receive window) per each NX engine. VAS uses
> this FIFO to communicate the request to NX. The kernel opens send
> window which is used to transfer compression/decompression requests
> to VAS. It maps the send window to the corresponding RxFIFO.
> copy/paste instructions are used to pass the CRB to VAS.
> 
> This patch series adds P9 NX support for 842 compression engine.
> First 4 patches reorganize the current code so that VAS function
> can be added.
> - nx842_powernv_function points to VAS function if VAS feature is
>   available. Otherwise icswx function is used.
> - Move configure CRB code nx842_cfg_crb() 
> - In addition to freeing co-processor structs for initialization 
>   failures and exit, both send and receive windows have to closed
>   for VAS.
> - Move updating coprocessor info list to nx842_add_coprocs_list().
> 
> The last 2 patches adds configuring and invoking VAS, and also
> checking P9 NX specific errors that are provided in co-processor
> status block (CSB) for failures.
> 
> Patches have been tested on P9 DD1 system using VAS changes and
> on P8 HW to make sure no regression.
> 
> This patchset depends on VAS kernel changes:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-May/158178.html

Just a question, we no longer invalidate the copy buffer on context
switch after this patch:

07d2a628bc ("powerpc/64s: Avoid cpabort in context switch when possible")

If your vas address mappings are visible only to kernel, only used in
process / kthread context, and only used with kernel preemption disabled,
this is okay.

If userspace can possibly copy/paste to the mappings or if you need to
sleep or call this from interrupt context, we need to work out how to
invalidate the copy buffer.

Thanks,
Nick


<    1   2