Re: BUG: unable to handle kernel paging request in hmac_init_tfm

2017-12-20 Thread Dmitry Vyukov
On Thu, Dec 21, 2017 at 12:09 AM, Eric Biggers  wrote:
> On Mon, Dec 18, 2017 at 11:36:01AM -0800, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on
>> 6084b576dca2e898f5c101baef151f7bfdbb606d
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>
> FYI, in linux-next KASAN and other memory debugging options are now behind
> CONFIG_DEBUG_MEMORY.  So, I think KASAN isn't getting turned on anymore, 
> despite
> the CONFIG_KASAN=y.  (Patch was "lib/: make "Memory Debugging" a menuconfig to
> ease disabling it all".)

Ouch! That would be pretty bad.

But I've tried both linux-next HEAD at:

commit 0e08c463db387a2adcb0243b15ab868a73f87807 (HEAD, tag:
next-20171221, linux-next/master)
Author: Stephen Rothwell 
Date:   Thu Dec 21 15:37:39 2017 +1100
Add linux-next specific files for 20171221

and mmots HEAD at:

commit 75aa5540627fdb3d8f86229776ea87f995275351 (HEAD, tag:
v4.15-rc4-mmots-2017-12-20-19-10, mmots/master)
Author: Linus Torvalds 
Date:   Thu Dec 21 04:02:17 2017 +
pci: test for unexpectedly disabled bridges

and after running make olddefconfig I still see CONFIG_KASAN=y in
.config, nor I can find CONFIG_DEBUG_MEMORY in menuconfig.

What am I missing? What commit has added the config?


Re: [PATCH 1/2] padata: Remove FSF address

2017-12-20 Thread Philippe Ombredanne
Dear CheahKC,

On Wed, Dec 20, 2017 at 10:17 PM, Cheah Kok Cheong  wrote:
> On Wed, Dec 20, 2017 at 09:20:48PM +0100, Philippe Ombredanne wrote:
>> On Wed, Dec 20, 2017 at 9:15 PM, Cheah Kok Cheong  wrote:
>> > Remove FSF address otherwise checkpatch will flag my next patch.
>> >
>> > Signed-off-by: Cheah Kok Cheong 
>> > ---
>> >  kernel/padata.c | 4 
>> >  1 file changed, 4 deletions(-)
>> >
>> > diff --git a/kernel/padata.c b/kernel/padata.c
>> > index 57c0074..9d91909 100644
>> > --- a/kernel/padata.c
>> > +++ b/kernel/padata.c
>> > @@ -14,10 +14,6 @@
>> >   * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> >   * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License 
>> > for
>> >   * more details.
>> > - *
>> > - * You should have received a copy of the GNU General Public License 
>> > along with
>> > - * this program; if not, write to the Free Software Foundation, Inc.,
>> > - * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
>> >   */
>> >
>> >  #include 
>> > --
>> > 2.7.4
>> >
>>
>>
>> Why not use instead the simpler SPDX ids?
>> --
>> Cordially
>> Philippe Ombredanne
>
> Hi Philippe,
>   Adding the SPDX id can be the subject of a separate patch.
> Believe you are part of the team doing an audit in this matter.
> I shall leave adding the SPDX id to you guys the professionals.
> Looks like you guys are using script to do this en masse.
>
> But if you insist and our maintainer is agreeable, I can send
> another patch to add the SPDX id.

If you could work it out directly that would be best. There are a lot
of files in the kernel and each one needs a careful review even with
best license detection script (which . So tagging files on the go when
they are updated is the best way to help!

Thank you.

-- 
Cordially
Philippe Ombredanne


Re: [PATCH RFC 2/3] crypto: Implement a generic crypto statistics

2017-12-20 Thread Stephan Mueller
Am Mittwoch, 20. Dezember 2017, 21:09:26 CET schrieb Corentin Labbe:

Hi Corentin,

> This patch implement a generic way to get statistics about all crypto
> usages.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  crypto/Kconfig |  11 +++
>  crypto/ahash.c |  18 +
>  crypto/algapi.c| 186
> + crypto/rng.c   | 
>  3 +
>  include/crypto/acompress.h |  10 +++
>  include/crypto/akcipher.h  |  12 +++
>  include/crypto/kpp.h   |   9 +++
>  include/crypto/rng.h   |   5 ++
>  include/crypto/skcipher.h  |   8 ++
>  include/linux/crypto.h |  22 ++
>  10 files changed, 284 insertions(+)
> 
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index d6e9b60fc063..69f1822a026b 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -1781,6 +1781,17 @@ config CRYPTO_USER_API_AEAD
> This option enables the user-spaces interface for AEAD
> cipher algorithms.
> 
> +config CRYPTO_STATS
> + bool "Crypto usage statistics for User-space"
> + help
> +   This option enables the gathering of crypto stats.
> +   This will collect:
> +   - encrypt/decrypt size and numbers of symmeric operations
> +   - compress/decompress size and numbers of compress operations
> +   - size and numbers of hash operations
> +   - encrypt/decrypt/sign/verify numbers for asymmetric operations
> +   - generate/seed numbers for rng operations
> +
>  config CRYPTO_HASH_INFO
>   bool
> 
> diff --git a/crypto/ahash.c b/crypto/ahash.c
> index 3a35d67de7d9..93b56892f1b8 100644
> --- a/crypto/ahash.c
> +++ b/crypto/ahash.c
> @@ -356,18 +356,36 @@ static int crypto_ahash_op(struct ahash_request *req,
> 
>  int crypto_ahash_final(struct ahash_request *req)
>  {
> +#ifdef CONFIG_CRYPTO_STATS
> + struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +
> + tfm->base.__crt_alg->enc_cnt++;
> + tfm->base.__crt_alg->enc_tlen += req->nbytes;
> +#endif
>   return crypto_ahash_op(req, crypto_ahash_reqtfm(req)->final);
>  }
>  EXPORT_SYMBOL_GPL(crypto_ahash_final);
> 
>  int crypto_ahash_finup(struct ahash_request *req)
>  {
> +#ifdef CONFIG_CRYPTO_STATS
> + struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +
> + tfm->base.__crt_alg->enc_cnt++;
> + tfm->base.__crt_alg->enc_tlen += req->nbytes;
> +#endif
>   return crypto_ahash_op(req, crypto_ahash_reqtfm(req)->finup);
>  }
>  EXPORT_SYMBOL_GPL(crypto_ahash_finup);
> 
>  int crypto_ahash_digest(struct ahash_request *req)
>  {
> +#ifdef CONFIG_CRYPTO_STATS
> + struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
> +
> + tfm->base.__crt_alg->enc_cnt++;
> + tfm->base.__crt_alg->enc_tlen += req->nbytes;
> +#endif

This is ugly. May I ask that one inlne function is made that has these ifdefs 
instead of adding them throughout multiple functions? This comment would apply 
to the other instrumentation code below as well. The issue is that these 
ifdefs should not be spread around through the code.

Besides, I would think that the use of normal integers is not thread-safe. 
What about using atomic_t?

>   return crypto_ahash_op(req, crypto_ahash_reqtfm(req)->digest);
>  }
>  EXPORT_SYMBOL_GPL(crypto_ahash_digest);
> diff --git a/crypto/algapi.c b/crypto/algapi.c
> index b8f6122f37e9..4fca4576af78 100644
> --- a/crypto/algapi.c
> +++ b/crypto/algapi.c
> @@ -20,11 +20,158 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "internal.h"
> 
>  static LIST_HEAD(crypto_template_list);
> 
> +#ifdef CONFIG_CRYPTO_STATS

Instead of ifdefing in the code, may I suggest adding a new file that is 
compiled / not compiled via the Makefile?

> +static struct kobject *crypto_root_kobj;
> +
> +static struct kobj_type dynamic_kobj_ktype = {
> + .sysfs_ops  = &kobj_sysfs_ops,
> +};
> +
> +static ssize_t fcrypto_priority(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + struct crypto_alg *alg;
> +
> + alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
> + return snprintf(buf, 9, "%d\n", alg->cra_priority);
> +}
> +
> +static ssize_t fcrypto_stat_enc_cnt(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + struct crypto_alg *alg;
> +
> + alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
> + return snprintf(buf, 9, "%lu\n", alg->enc_cnt);
> +}
> +
> +static ssize_t fcrypto_stat_enc_tlen(struct kobject *kobj,
> +  struct kobj_attribute *attr, char *buf)
> +{
> + struct crypto_alg *alg;
> +
> + alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
> + return snprintf(buf, 9, "%lu\n", alg->enc_tlen);
> +}
> +
> +static ssize_t fcrypto_stat_dec_cnt(struct kobject *kobj,
> + struct kobj_attribute *attr, char *buf)
> +{
> + struct crypto_alg *alg;
> +
> + alg = container_of(kobj, struct cr

Re: [PATCH RFC 1/3] crypto: Prevent to register duplicate cra_driver_name

2017-12-20 Thread Herbert Xu
On Wed, Dec 20, 2017 at 08:09:25PM +, Corentin Labbe wrote:
> Each crypto algorithm "cra_name" can have multiple implementation called
> "cra_driver_name".
> If two different implementation have the same cra_driver_name, nothing
> can easily differentiate them.
> Furthermore the mechanism for getting a crypto algorithm with its
> implementation name (crypto_alg_match() in crypto/crypto_user.c) will
> get only the first one found.
> 
> So this patch prevent the registration of two implementation with the
> same cra_driver_name.
> 
> Signed-off-by: Corentin Labbe 

No this is intentional.  The idea is that you can hot-replace
an implementation by registering a new version of it while the
old one is still in use.  The new one will be used for all new
allocations.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH RFC 1/3] crypto: Prevent to register duplicate cra_driver_name

2017-12-20 Thread Stephan Mueller
Am Mittwoch, 20. Dezember 2017, 21:09:25 CET schrieb Corentin Labbe:

Hi Corentin,

> Each crypto algorithm "cra_name" can have multiple implementation called
> "cra_driver_name".
> If two different implementation have the same cra_driver_name, nothing
> can easily differentiate them.
> Furthermore the mechanism for getting a crypto algorithm with its
> implementation name (crypto_alg_match() in crypto/crypto_user.c) will
> get only the first one found.
> 
> So this patch prevent the registration of two implementation with the
> same cra_driver_name.

Have you seen that happen? The driver_name should be an unambiguous name that 
is unique throughout all crypto implementations.

If you have seen a duplication, then this duplication should be fixed.

Ciao
Stephan


Re: [PATCH v3 1/2] crypto: Fix out-of-bounds access of the data buffer in generic-gcm-aesni

2017-12-20 Thread Junaid Shahid
On Wednesday, December 20, 2017 5:08:37 PM PST Junaid Shahid wrote:
> +.macro READ_PARTIAL_BLOCK DPTR DLEN XMM1 XMMDst
> +cmp $8, \DLEN
> +jl _read_lt8_\@

> +mov (\DPTR), %rax
> +MOVQ_R64_XMM %rax, \XMMDst
Just noticed that these two can be replaced with:
+movq (\DPTR), \XMMDst

> +sub $8, \DLEN
> +jz _done_read_partial_block_\@
> + xor %eax, %eax
> +_read_next_byte_\@:



[PATCH v3 2/2] crypto: Fix out-of-bounds access of the AAD buffer in generic-gcm-aesni

2017-12-20 Thread Junaid Shahid
The aesni_gcm_enc/dec functions can access memory after the end of
the AAD buffer if the AAD length is not a multiple of 4 bytes.
It didn't matter with rfc4106-gcm-aesni as in that case the AAD was
always followed by the 8 byte IV, but that is no longer the case with
generic-gcm-aesni. This can potentially result in accessing a page that
is not mapped and thus causing the machine to crash. This patch fixes
that by reading the last <16 byte block of the AAD byte-by-byte and
optionally via an 8-byte load if the block was at least 8 bytes.

Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen")
Signed-off-by: Junaid Shahid 
---
 arch/x86/crypto/aesni-intel_asm.S | 112 --
 1 file changed, 12 insertions(+), 100 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S 
b/arch/x86/crypto/aesni-intel_asm.S
index c36b850fdc81..76d8cd426a31 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -89,30 +89,6 @@ SHIFT_MASK: .octa 0x0f0e0d0c0b0a09080706050403020100
 ALL_F:  .octa 0x
 .octa 0x
 
-.section .rodata
-.align 16
-.type aad_shift_arr, @object
-.size aad_shift_arr, 272
-aad_shift_arr:
-.octa 0x
-.octa 0xff0C
-.octa 0x0D0C
-.octa 0xff0E0D0C
-.octa 0x0F0E0D0C
-.octa 0xff0C0B0A0908
-.octa 0x0D0C0B0A0908
-.octa 0xff0E0D0C0B0A0908
-.octa 0x0F0E0D0C0B0A0908
-.octa 0xff0C0B0A090807060504
-.octa 0x0D0C0B0A090807060504
-.octa 0xff0E0D0C0B0A090807060504
-.octa 0x0F0E0D0C0B0A090807060504
-.octa 0xff0C0B0A09080706050403020100
-.octa 0x0D0C0B0A09080706050403020100
-.octa 0xff0E0D0C0B0A09080706050403020100
-.octa 0x0F0E0D0C0B0A09080706050403020100
-
-
 .text
 
 
@@ -303,62 +279,30 @@ _done_read_partial_block_\@:
 XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
 MOVADQ SHUF_MASK(%rip), %xmm14
movarg7, %r10   # %r10 = AAD
-   movarg8, %r12   # %r12 = aadLen
-   mov%r12, %r11
+   movarg8, %r11   # %r11 = aadLen
pxor   %xmm\i, %xmm\i
pxor   \XMM2, \XMM2
 
cmp$16, %r11
-   jl _get_AAD_rest8\num_initial_blocks\operation
+   jl _get_AAD_rest\num_initial_blocks\operation
 _get_AAD_blocks\num_initial_blocks\operation:
movdqu (%r10), %xmm\i
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
pxor   %xmm\i, \XMM2
GHASH_MUL  \XMM2, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
add$16, %r10
-   sub$16, %r12
sub$16, %r11
cmp$16, %r11
jge_get_AAD_blocks\num_initial_blocks\operation
 
movdqu \XMM2, %xmm\i
+
+   /* read the last <16B of AAD */
+_get_AAD_rest\num_initial_blocks\operation:
cmp$0, %r11
je _get_AAD_done\num_initial_blocks\operation
 
-   pxor   %xmm\i,%xmm\i
-
-   /* read the last <16B of AAD. since we have at least 4B of
-   data right after the AAD (the ICV, and maybe some CT), we can
-   read 4B/8B blocks safely, and then get rid of the extra stuff */
-_get_AAD_rest8\num_initial_blocks\operation:
-   cmp$4, %r11
-   jle_get_AAD_rest4\num_initial_blocks\operation
-   movq   (%r10), \TMP1
-   add$8, %r10
-   sub$8, %r11
-   pslldq $8, \TMP1
-   psrldq $8, %xmm\i
-   pxor   \TMP1, %xmm\i
-   jmp_get_AAD_rest8\num_initial_blocks\operation
-_get_AAD_rest4\num_initial_blocks\operation:
-   cmp$0, %r11
-   jle_get_AAD_rest0\num_initial_blocks\operation
-   mov(%r10), %eax
-   movq   %rax, \TMP1
-   add$4, %r10
-   sub$4, %r10
-   pslldq $12, \TMP1
-   psrldq $4, %xmm\i
-   pxor   \TMP1, %xmm\i
-_get_AAD_rest0\num_initial_blocks\operation:
-   /* finalize: shift out the extra bytes we read, and align
-   left. since pslldq can only shift by an immediate, we use
-   vpshufb and an array of shuffle masks */
-   movq   %r12, %r11
-   salq   $4, %r11
-   movdqu aad_shift_arr(%r11), \TMP1
-   PSHUFB_XMM \TMP1, %xmm\i
-_get_AAD_rest_final\num_initial_blocks\operation:
+   READ_PARTIAL_BLOCK %r10, %r11, \TMP1, %xmm\i
PSHUFB_XMM   %xmm14, %xmm\i # byte-reflect the AAD data
pxor   \XMM2, %xmm\i
GHASH_MUL  %xmm\i, \TMP3, \TMP1, \TMP2, \T

[PATCH v3 1/2] crypto: Fix out-of-bounds access of the data buffer in generic-gcm-aesni

2017-12-20 Thread Junaid Shahid
The aesni_gcm_enc/dec functions can access memory before the start of
the data buffer if the length of the data buffer is less than 16 bytes.
This is because they perform the read via a single 16-byte load. This
can potentially result in accessing a page that is not mapped and thus
causing the machine to crash. This patch fixes that by reading the
partial block byte-by-byte and optionally an via 8-byte load if the block
was at least 8 bytes.

Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen")
Signed-off-by: Junaid Shahid 
---
 arch/x86/crypto/aesni-intel_asm.S | 87 ---
 1 file changed, 45 insertions(+), 42 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S 
b/arch/x86/crypto/aesni-intel_asm.S
index 16627fec80b2..c36b850fdc81 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -256,6 +256,37 @@ aad_shift_arr:
pxor  \TMP1, \GH# result is in TMP1
 .endm
 
+# Reads DLEN bytes starting at DPTR and stores in XMMDst
+# where 0 < DLEN < 16
+# Clobbers %rax, DLEN and XMM1
+.macro READ_PARTIAL_BLOCK DPTR DLEN XMM1 XMMDst
+cmp $8, \DLEN
+jl _read_lt8_\@
+mov (\DPTR), %rax
+MOVQ_R64_XMM %rax, \XMMDst
+sub $8, \DLEN
+jz _done_read_partial_block_\@
+   xor %eax, %eax
+_read_next_byte_\@:
+shl $8, %rax
+mov 7(\DPTR, \DLEN, 1), %al
+dec \DLEN
+jnz _read_next_byte_\@
+MOVQ_R64_XMM %rax, \XMM1
+   pslldq $8, \XMM1
+por \XMM1, \XMMDst
+   jmp _done_read_partial_block_\@
+_read_lt8_\@:
+   xor %eax, %eax
+_read_next_byte_lt8_\@:
+shl $8, %rax
+mov -1(\DPTR, \DLEN, 1), %al
+dec \DLEN
+jnz _read_next_byte_lt8_\@
+MOVQ_R64_XMM %rax, \XMMDst
+_done_read_partial_block_\@:
+.endm
+
 /*
 * if a = number of total plaintext bytes
 * b = floor(a/16)
@@ -1385,14 +1416,6 @@ _esb_loop_\@:
 *
 *AAD Format with 64-bit Extended Sequence Number
 *
-* aadLen:
-*   from the definition of the spec, aadLen can only be 8 or 12 bytes.
-*   The code supports 16 too but for other sizes, the code will fail.
-*
-* TLen:
-*   from the definition of the spec, TLen can only be 8, 12 or 16 bytes.
-*   For other sizes, the code will fail.
-*
 * poly = x^128 + x^127 + x^126 + x^121 + 1
 *
 */
@@ -1486,19 +1509,16 @@ _zero_cipher_left_decrypt:
PSHUFB_XMM %xmm10, %xmm0
 
ENCRYPT_SINGLE_BLOCK  %xmm0, %xmm1# E(K, Yn)
-   sub $16, %r11
-   add %r13, %r11
-   movdqu (%arg3,%r11,1), %xmm1   # receive the last <16 byte block
-   lea SHIFT_MASK+16(%rip), %r12
-   sub %r13, %r12
-# adjust the shuffle mask pointer to be able to shift 16-%r13 bytes
-# (%r13 is the number of bytes in plaintext mod 16)
-   movdqu (%r12), %xmm2   # get the appropriate shuffle mask
-   PSHUFB_XMM %xmm2, %xmm1# right shift 16-%r13 butes
 
+   lea (%arg3,%r11,1), %r10
+   mov %r13, %r12
+   READ_PARTIAL_BLOCK %r10 %r12 %xmm2 %xmm1
+
+   lea ALL_F+16(%rip), %r12
+   sub %r13, %r12
movdqa  %xmm1, %xmm2
pxor %xmm1, %xmm0# Ciphertext XOR E(K, Yn)
-   movdqu ALL_F-SHIFT_MASK(%r12), %xmm1
+   movdqu (%r12), %xmm1
# get the appropriate mask to mask out top 16-%r13 bytes of %xmm0
pand %xmm1, %xmm0# mask out top 16-%r13 bytes of %xmm0
pand%xmm1, %xmm2
@@ -1507,9 +1527,6 @@ _zero_cipher_left_decrypt:
 
pxor %xmm2, %xmm8
GHASH_MUL %xmm8, %xmm13, %xmm9, %xmm10, %xmm11, %xmm5, %xmm6
- # GHASH computation for the last <16 byte block
-   sub %r13, %r11
-   add $16, %r11
 
 # output %r13 bytes
MOVQ_R64_XMM%xmm0, %rax
@@ -1663,14 +1680,6 @@ ENDPROC(aesni_gcm_dec)
 *
 * AAD Format with 64-bit Extended Sequence Number
 *
-* aadLen:
-*   from the definition of the spec, aadLen can only be 8 or 12 bytes.
-*   The code supports 16 too but for other sizes, the code will fail.
-*
-* TLen:
-*   from the definition of the spec, TLen can only be 8, 12 or 16 bytes.
-*   For other sizes, the code will fail.
-*
 * poly = x^128 + x^127 + x^126 + x^121 + 1
 ***/
 ENTRY(aesni_gcm_enc)
@@ -1763,19 +1772,16 @@ _zero_cipher_left_encrypt:
 movdqa SHUF_MASK(%rip), %xmm10
PSHUFB_XMM %xmm10, %xmm0
 
-
ENCRYPT_SINGLE_BLOCK%xmm0, %xmm1# Encrypt(K, Yn)
-   sub $16, %r11
-   add %r13, %r11
-   movdqu (%arg3,%r11,1), %xmm1 # receive the last <16 byte blocks
-   lea SHIFT_MASK+16(%rip), %r12
+
+   lea (%arg3,%r11,1), %r10
+   mov %r13, %r12
+   READ_PARTIAL_BLOCK %r10 %r12 %xmm2 %xmm1
+
+   lea ALL_F+16(%rip), %r12
sub %r13, %r12
-   

[PATCH v3 0/2] Fix out-of-bounds memory accesses in generic-gcm-aesni

2017-12-20 Thread Junaid Shahid
Changelog:
v3:
- Fixed a bug in READ_PARTIAL_BLOCK when used for reading the AAD
- Some refactoring per CR feedback
v2:
- Also fixed issue 2 described below
v1:
- Fixed issue 1 described below

The aesni_gcm_enc/dec functions can access memory before the start or end of
the supplied src buffer. This can happen if either:

1. The data length is less than 16 bytes and there is no AAD or the AAD
   length is not enough to cover the underrun. In this case, memory before
   the start of the buffer would be accessed.
2. The AAD length is not a multiple of 4 bytes and the data length is too
   small to cover the overrun. In this case, memory after the end of the
   buffer would be accessed.

This was not a problem when rfc4106-gcm-aesni was the only mode supported by
the aesni module, as in that case there is always enough AAD and IV bytes to
cover the out-of-bounds accesses. However, that is no longer the case with
the generic-gcm-aesni mode. This could potentially result in accessing pages
that are not mapped, thus causing a crash.


Junaid Shahid (2):
  crypto: Fix out-of-bounds access of the data buffer in
generic-gcm-aesni
  crypto: Fix out-of-bounds access of the AAD buffer in
generic-gcm-aesni

 arch/x86/crypto/aesni-intel_asm.S | 199 +++---
 1 file changed, 57 insertions(+), 142 deletions(-)

-- 
2.15.1.620.gb9897f4670-goog



Re: BUG: unable to handle kernel paging request in hmac_init_tfm

2017-12-20 Thread Eric Biggers
On Mon, Dec 18, 2017 at 11:36:01AM -0800, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on
> 6084b576dca2e898f5c101baef151f7bfdbb606d
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> 
> Unfortunately, I don't have any reproducer for this bug yet.
> 
> 
> BUG: unable to handle kernel paging request at 8802dca4f748
> IP: hmac_init_tfm+0x5d/0x90 crypto/hmac.c:169
> PGD 404e067 P4D 404e067 PUD 0
> Oops: 0002 [#1] SMP
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 6418 Comm: syz-executor5 Not tainted
> 4.15.0-rc3-next-20171214+ #67
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> RIP: 0010:hmac_init_tfm+0x5d/0x90 crypto/hmac.c:169
> RSP: 0018:c9f77df8 EFLAGS: 00010202
> RAX: 0068 RBX: 8801dca52308 RCX: 816847d3
> RDX: 05d6 RSI: c90004d7e000 RDI: 8802143cc708
> RBP: c9f77e10 R08: 0002bcf8 R09: 8802143cc700
> R10:  R11:  R12: 8802143cc700
> R13: 8802dca4f748 R14: 8801e0a96c50 R15: c9f77ed0
> FS:  7fecf7c1a700() GS:88021fd0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 8802dca4f748 CR3: 0001e47f4000 CR4: 001406e0
> DR0: 2008 DR1: 2008 DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0600
> Call Trace:
>  crypto_create_tfm+0xb4/0x120 crypto/api.c:466
>  crypto_alloc_tfm+0x82/0x180 crypto/api.c:540
>  crypto_alloc_shash+0x2c/0x40 crypto/shash.c:436
>  sctp_listen_start net/sctp/socket.c:7496 [inline]
>  sctp_inet_listen+0x1c1/0x240 net/sctp/socket.c:7584
>  SYSC_listen net/socket.c:1483 [inline]
>  SyS_listen+0x71/0xb0 net/socket.c:1469
>  entry_SYSCALL_64_fastpath+0x1f/0x96
> RIP: 0033:0x452a39
> RSP: 002b:7fecf7c19c58 EFLAGS: 0212 ORIG_RAX: 0032
> RAX: ffda RBX: 00758020 RCX: 00452a39
> RDX:  RSI: 0ae0 RDI: 001b
> RBP: 00e8 R08:  R09: 
> R10:  R11: 0212 R12: 006ef660
> R13:  R14: 7fecf7c1a6d4 R15: 
> Code: ff 01 c0 4c 8d 6c 02 07 e8 a1 22 ff ff 49 83 e5 f8 48 3d 00 f0
> ff ff 49 89 c4 77 25 e8 6d 5b c3 ff 41 8b 04 24 83 c0 10 89 43 f8
> <4d> 89 65 00 45 31 e4 e8 57 5b c3 ff 44 89 e0 5b 41 5c 41 5d 5d
> RIP: hmac_init_tfm+0x5d/0x90 crypto/hmac.c:169 RSP: c9f77df8
> CR2: 8802dca4f748
> ---[ end trace ec6d3df7509d0a3b ]---
> Kernel panic - not syncing: Fatal exception
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..

The crash was actually on the line 'ctx->hash = hash;' in hmac_init_tfm():

static int hmac_init_tfm(struct crypto_tfm *tfm)
{
struct crypto_shash *parent = __crypto_shash_cast(tfm);
struct crypto_shash *hash;
struct crypto_instance *inst = (void *)tfm->__crt_alg;
struct crypto_shash_spawn *spawn = crypto_instance_ctx(inst);
struct hmac_ctx *ctx = hmac_ctx(parent);

hash = crypto_spawn_shash(spawn);
if (IS_ERR(hash))
return PTR_ERR(hash);

parent->descsize = sizeof(struct shash_desc) +
   crypto_shash_descsize(hash);

ctx->hash = hash;
return 0;
}

'parent' was 8801dca52308 in RBX while 'ctx' was 8802dca4f748 in R13.
They are supposed to be almost the same but were actually quite different, so I
think the ->statesize and/or the ->cra_alignmask in the shash_alg got corrupted,
causing hmac_ctx() to return the wrong value.  I am guessing use-after-free, so
the report would have been more useful with KASAN enabled.

Eric


Re: [PATCH RFC 2/3] crypto: Implement a generic crypto statistics

2017-12-20 Thread Randy Dunlap
On 12/20/2017 12:09 PM, Corentin Labbe wrote:
> This patch implement a generic way to get statistics about all crypto
> usages.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  crypto/Kconfig |  11 +++
>  crypto/ahash.c |  18 +
>  crypto/algapi.c| 186 
> +
>  crypto/rng.c   |   3 +
>  include/crypto/acompress.h |  10 +++
>  include/crypto/akcipher.h  |  12 +++
>  include/crypto/kpp.h   |   9 +++
>  include/crypto/rng.h   |   5 ++
>  include/crypto/skcipher.h  |   8 ++
>  include/linux/crypto.h |  22 ++
>  10 files changed, 284 insertions(+)
> 
> diff --git a/crypto/Kconfig b/crypto/Kconfig
> index d6e9b60fc063..69f1822a026b 100644
> --- a/crypto/Kconfig
> +++ b/crypto/Kconfig
> @@ -1781,6 +1781,17 @@ config CRYPTO_USER_API_AEAD
> This option enables the user-spaces interface for AEAD
> cipher algorithms.
>  
> +config CRYPTO_STATS
> + bool "Crypto usage statistics for User-space"
> + help
> +   This option enables the gathering of crypto stats.
> +   This will collect:
> +   - encrypt/decrypt size and numbers of symmeric operations

symmetric

> +   - compress/decompress size and numbers of compress operations
> +   - size and numbers of hash operations
> +   - encrypt/decrypt/sign/verify numbers for asymmetric operations
> +   - generate/seed numbers for rng operations
> +
>  config CRYPTO_HASH_INFO
>   bool
>  

> diff --git a/crypto/algapi.c b/crypto/algapi.c
> index b8f6122f37e9..4fca4576af78 100644
> --- a/crypto/algapi.c
> +++ b/crypto/algapi.c
> @@ -20,11 +20,158 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "internal.h"
>  

> +static ssize_t fcrypto_stat_type(struct kobject *kobj,
> +  struct kobj_attribute *attr, char *buf)
> +{
> + struct crypto_alg *alg;
> + u32 type;
> +
> + alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
> + type = (alg->cra_flags & CRYPTO_ALG_TYPE_MASK);
> + if (type == CRYPTO_ALG_TYPE_ABLKCIPHER ||
> + type == CRYPTO_ALG_TYPE_SKCIPHER ||
> + type == CRYPTO_ALG_TYPE_CIPHER ||
> + type == CRYPTO_ALG_TYPE_BLKCIPHER
> + )
> + return snprintf(buf, 9, "cipher\n");
> + if (type == CRYPTO_ALG_TYPE_AHASH ||
> + type == CRYPTO_ALG_TYPE_HASH
> + )
> + return snprintf(buf, 9, "hash\n");
> + if (type == CRYPTO_ALG_TYPE_COMPRESS ||
> + type == CRYPTO_ALG_TYPE_SCOMPRESS)
> + return snprintf(buf, 11, "compress\n");
> + if (type == CRYPTO_ALG_TYPE_RNG)
> + return snprintf(buf, 9, "rng\n");
> + if (type == CRYPTO_ALG_TYPE_AKCIPHER)
> + return snprintf(buf, 11, "asymetric\n");

  asymmetric

> + if (type == CRYPTO_ALG_TYPE_KPP)
> + return snprintf(buf, 4, "kpp\n");
> + return snprintf(buf, 16, "unknown %x\n", type);
> +}


-- 
~Randy


Re: [PATCH RFC 3/3] crypto: tools: Add cryptostat userspace

2017-12-20 Thread Randy Dunlap
On 12/20/2017 12:09 PM, Corentin Labbe wrote:
> Add an example tool for getting easily crypto statistics.
> 
> Signed-off-by: Corentin Labbe 
> ---
>  tools/crypto/cryptostat | 40 
>  1 file changed, 40 insertions(+)
>  create mode 100755 tools/crypto/cryptostat
> 
> diff --git a/tools/crypto/cryptostat b/tools/crypto/cryptostat
> new file mode 100755
> index ..314e108eb608
> --- /dev/null
> +++ b/tools/crypto/cryptostat
> @@ -0,0 +1,40 @@
> +#!/bin/bash
> +
> +for dn in `ls /sys/kernel/crypto/`
> +do
> + dnpath=/sys/kernel/crypto/$dn/
> + algtype=$(cat $dnpath/algtype)
> + algname=$(cat $dnpath/algname)
> + echo "$dn ($algtype) $algname"
> + case $algtype in
> + hash)
> + echo -e "\tHash: $(cat $dnpath/enc_cnt)\tbytes $(cat $dnpath/enc_tlen)"
> + ;;
> + cipher)
> + echo -e "\tEncrypt: $(cat $dnpath/enc_cnt)\tbytes $(cat 
> $dnpath/enc_tlen)"
> + echo -e "\tDecrypt: $(cat $dnpath/dec_cnt)\tbytes $(cat 
> $dnpath/dec_tlen)"
> + ;;
> + compress)
> + echo -e "\tCompress: $(cat $dnpath/enc_cnt)\tbytes $(cat 
> $dnpath/enc_tlen)"
> + echo -e "\tDecompress: $(cat $dnpath/dec_cnt)\tbytes $(cat 
> $dnpath/dec_tlen)"
> + ;;
> + asymetric)

spello:
asymmetric)

> + echo -e "\tEncrypt: $(cat $dnpath/enc_cnt)\tbytes $(cat 
> $dnpath/enc_tlen)"
> + echo -e "\tDecrypt: $(cat $dnpath/dec_cnt)\tbytes $(cat 
> $dnpath/dec_tlen)"
> + echo -e "\tVerify: $(cat $dnpath/verify_cnt)"
> + echo -e "\tSign: $(cat $dnpath/sign_cnt)"
> + ;;
> + rng)
> + echo -e "\tGenerate: $(cat $dnpath/enc_cnt)\tbytes $(cat 
> $dnpath/enc_tlen)"
> + echo -e "\tSeed: $(cat $dnpath/dec_cnt)"
> + ;;
> + kpp)
> + echo -e "\tSet_secret: $(cat $dnpath/dec_cnt)"
> + echo -e "\tGenerate_public_key: $(cat $dnpath/enc_cnt)"
> + echo -e "\tCompute_shared_secret: $(cat $dnpath/verify_cnt)"
> + ;;
> + *)
> + echo -e "\t$(cat $dnpath/enc_cnt)\tbytes $(cat $dnpath/enc_tlen)"
> + ;;
> + esac
> +done
> 


-- 
~Randy


Re: BUG: unable to handle kernel paging request in hmac_init_tfm

2017-12-20 Thread Eric Biggers
On Mon, Dec 18, 2017 at 11:36:01AM -0800, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on
> 6084b576dca2e898f5c101baef151f7bfdbb606d
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached

FYI, in linux-next KASAN and other memory debugging options are now behind
CONFIG_DEBUG_MEMORY.  So, I think KASAN isn't getting turned on anymore, despite
the CONFIG_KASAN=y.  (Patch was "lib/: make "Memory Debugging" a menuconfig to
ease disabling it all".)

Eric


Re: BUG: unable to handle kernel paging request in socket_file_ops

2017-12-20 Thread Eric Biggers
On Wed, Dec 20, 2017 at 12:51:01PM -0800, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on
> 6084b576dca2e898f5c101baef151f7bfdbb606d
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
> 
> 
> alloc_fd: slot 80 not NULL!
> BUG: unable to handle kernel paging request at 
> alloc_fd: slot 81 not NULL!
> alloc_fd: slot 82 not NULL!
> alloc_fd: slot 83 not NULL!
> alloc_fd: slot 84 not NULL!
> alloc_fd: slot 86 not NULL!
> alloc_fd: slot 87 not NULL!
> IP: socket_file_ops+0x22/0x4d0
> PGD 3021067 P4D 3021067 PUD 3023067 PMD 0
> Oops: 0002 [#1] SMP
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Modules linked in:
> CPU: 1 PID: 3358 Comm: cryptomgr_test Not tainted
> 4.15.0-rc3-next-20171214+ #67
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> RIP: 0010:socket_file_ops+0x22/0x4d0
> RSP: 0018:c900017fbdf0 EFLAGS: 00010246
> RAX: 880214e4ca00 RBX: 8802156c74a0 RCX: 81678ac3
> RDX:  RSI: 8802156c74a0 RDI: 8802156c74a0
> RBP: c900017fbe18 R08:  R09: 
> R10:  R11:  R12: 
> R13: c900017fbeb0 R14: c900017fbeb0 R15: c900017fbeb0
> FS:  () GS:88021fd0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 0301e002 CR4: 001606e0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  crypto_free_instance+0x2a/0x50 crypto/algapi.c:77
>  crypto_destroy_instance+0x1e/0x30 crypto/algapi.c:85
>  crypto_alg_put crypto/internal.h:116 [inline]
>  crypto_remove_final+0x73/0xa0 crypto/algapi.c:331
>  crypto_alg_tested+0x194/0x260 crypto/algapi.c:320
>  cryptomgr_test+0x17/0x30 crypto/algboss.c:226
>  kthread+0x149/0x170 kernel/kthread.c:238
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:524
> Code: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50 51 40 81
> ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f8
> <09> 82 ff ff ff ff 00 26 0a 82 ff ff ff ff 00 00 00 00 00 00 00
> RIP: socket_file_ops+0x22/0x4d0 RSP: c900017fbdf0
> CR2: 
> ---[ end trace 52c47d77c1a058d5 ]---
> BUG: unable to handle kernel NULL pointer dereference at 0064
> IP: __neigh_event_send+0xa8/0x400 net/core/neighbour.c:1006
> PGD 0 P4D 0
> Oops:  [#2] SMP
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 3122 Comm: sshd Tainted: G  D
> 4.15.0-rc3-next-20171214+ #67
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> RIP: 0010:__neigh_event_send+0xa8/0x400 net/core/neighbour.c:1006
> RSP: 0018:c9efb8b8 EFLAGS: 00010293
> RAX: 880214dba640 RBX: 8802156c4c00 RCX: 820e6fa4
> RDX:  RSI:  RDI: 8802156c4c28
> RBP: c9efb8f8 R08: 0001 R09: 820e6f28
> R10: c9efb828 R11:  R12: 8802156c4c28
> R13: 8802115896e0 R14:  R15: 82e2eaf8
> FS:  7f838bacb7c0() GS:88021fc0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0064 CR3: 000213530006 CR4: 001606f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  neigh_event_send include/net/neighbour.h:435 [inline]
>  neigh_resolve_output+0x24a/0x340 net/core/neighbour.c:1334
>  neigh_output include/net/neighbour.h:482 [inline]
>  ip_finish_output2+0x2cf/0x7b0 net/ipv4/ip_output.c:229
>  ip_finish_output+0x2e6/0x490 net/ipv4/ip_output.c:317
>  NF_HOOK_COND include/linux/netfilter.h:270 [inline]
>  ip_output+0x73/0x2b0 net/ipv4/ip_output.c:405
>  dst_output include/net/dst.h:443 [inline]
>  ip_local_out+0x54/0xb0 net/ipv4/ip_output.c:124
>  ip_queue_xmit+0x27d/0x740 net/ipv4/ip_output.c:504
>  tcp_transmit_skb+0x66a/0xd70 net/ipv4/tcp_output.c:1176
>  tcp_write_xmit+0x262/0x13a0 net/ipv4/tcp_output.c:2367
>  __tcp_push_pending_frames+0x49/0xe0 net/ipv4/tcp_output.c:2540
>  tcp_push+0x14e/0x190 net/ipv4/tcp.c:730
>  tcp_sendmsg_locked+0x899/0x11a0 net/ipv4/tcp.c:1424
>  tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1461
>  inet_sendmsg+0x54/0x250 net/ipv4/af_inet.c:763
>  sock_sendmsg_nosec net/socket.c:636 [inline]
>  sock_sendmsg+0x51/0x70 net/socket.c:646
>  sock_write_iter+0xa4/0x100 net/socket.c:915
>  call_write_iter include/linux/fs.h:1776 [inline]
>  new_sync_write fs/read_write.c:469 [inline]
>  __vfs_write+0x15b/0x

Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Eric Biggers
On Tue, Dec 19, 2017 at 11:48:01PM -0800, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on
> 032b4cc8ff84490c4bc7c4ef8c91e6d83a637538
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
> 
> 
> ==
> BUG: KASAN: use-after-free in crypto_aead_free_instance+0xc0/0xd0
> crypto/aead.c:154
> Read of size 8 at addr 8801c32cf240 by task cryptomgr_test/6646
> 
> CPU: 1 PID: 6646 Comm: cryptomgr_test Not tainted 4.15.0-rc3+ #132
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  print_address_description+0x73/0x250 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x25b/0x340 mm/kasan/report.c:409
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
>  crypto_aead_free_instance+0xc0/0xd0 crypto/aead.c:154
>  crypto_free_instance+0x6d/0x100 crypto/algapi.c:77
>  crypto_destroy_instance+0x3c/0x80 crypto/algapi.c:85
>  crypto_alg_put crypto/internal.h:116 [inline]
>  crypto_remove_final+0x212/0x370 crypto/algapi.c:331
>  crypto_alg_tested+0x445/0x6f0 crypto/algapi.c:320
>  cryptomgr_test+0x17/0x30 crypto/algboss.c:226
>  kthread+0x37a/0x440 kernel/kthread.c:238
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:441
> 
> Allocated by task 6641:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>  kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3610
>  kmalloc include/linux/slab.h:499 [inline]
>  kzalloc include/linux/slab.h:688 [inline]
>  pcrypt_create_aead crypto/pcrypt.c:291 [inline]
>  pcrypt_create+0x137/0x6c0 crypto/pcrypt.c:346
>  cryptomgr_probe+0x74/0x240 crypto/algboss.c:75
>  kthread+0x37a/0x440 kernel/kthread.c:238
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:441
> 
> Freed by task 3335:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>  set_track mm/kasan/kasan.c:459 [inline]
>  kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
>  __cache_free mm/slab.c:3488 [inline]
>  kfree+0xca/0x250 mm/slab.c:3803
>  crypto_larval_destroy+0x110/0x150 crypto/api.c:107
>  crypto_alg_put crypto/internal.h:116 [inline]
>  crypto_larval_kill+0x1e8/0x2e0 crypto/api.c:167
>  crypto_alg_mod_lookup+0x178/0x1b0 crypto/api.c:283
>  crypto_find_alg crypto/api.c:501 [inline]
>  crypto_alloc_tfm+0xf3/0x2f0 crypto/api.c:534
>  crypto_alloc_aead+0x2c/0x40 crypto/aead.c:342
>  aead_bind+0x70/0x140 crypto/algif_aead.c:482
>  alg_bind+0x1ab/0x440 crypto/af_alg.c:179
>  SYSC_bind+0x1b4/0x3f0 net/socket.c:1454
>  SyS_bind+0x24/0x30 net/socket.c:1440
>  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
>  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
>  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
> 

Probably the pcrypt_free() bug.

#syz dup: KASAN: use-after-free Read in __list_del_entry_valid (2)


Re: BUG: unable to handle kernel NULL pointer dereference in __crypto_alg_lookup

2017-12-20 Thread Eric Biggers
On Mon, Dec 18, 2017 at 07:25:41AM +0100, Stephan Mueller wrote:
> Am Montag, 18. Dezember 2017, 06:50:01 CET schrieb syzbot:
> 
> Hi,
> 
> > Hello,
> > 
> > syzkaller hit the following crash on
> > 41d8c16909ebda40f7b4982a7f5e2ad102705ade
> > git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > compiler: gcc (GCC) 7.1.1 20170620
> > .config is attached
> > Raw console output is attached.
> > C reproducer is attached
> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> > for information about syzkaller reproducers
> > 
> > 
> > BUG: unable to handle kernel NULL pointer dereference at 0020
> > IP: __crypto_alg_lookup+0x43/0x190 crypto/api.c:63
> > PGD 0 P4D 0
> > Oops:  [#1] SMP
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Modules linked in:
> > CPU: 1 PID: 3321 Comm: cryptomgr_probe Not tainted
> > 4.15.0-rc3-next-20171213+ #66
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:__crypto_alg_lookup+0x43/0x190 crypto/api.c:63
> > RSP: 0018:c900017ebd28 EFLAGS: 00010293
> > RAX: 880216762080 RBX:  RCX: 81673d23
> > RDX:  RSI: 0403 RDI: 880214eba120
> > RBP: c900017ebd70 R08: 0001 R09: 0001
> > R10: c900017ebcf0 R11:  R12: 
> > R13:  R14: 240f R15: 0403
> > FS:  () GS:88021fd0() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 0020 CR3: 00020dc6b000 CR4: 001406e0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: fffe0ff0 DR7: 0400
> > Call Trace:
> >   crypto_alg_lookup+0x31/0x50 crypto/api.c:201
> >   crypto_larval_lookup.part.8+0x34/0x1c0 crypto/api.c:218
> >   crypto_larval_lookup crypto/api.c:212 [inline]
> >   crypto_alg_mod_lookup+0x77/0x120 crypto/api.c:271
> >   crypto_find_alg+0x63/0x70 crypto/api.c:501
> >   crypto_grab_spawn+0x2f/0x80 crypto/algapi.c:631
> >   crypto_grab_aead+0x35/0x40 crypto/aead.c:336
> >   pcrypt_create_aead crypto/pcrypt.c:298 [inline]
> >   pcrypt_create+0xca/0x2c0 crypto/pcrypt.c:346
> >   cryptomgr_probe+0x40/0x100 crypto/algboss.c:75
> >   kthread+0x149/0x170 kernel/kthread.c:238
> >   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:524
> > Code: 89 7d d0 e8 70 66 c4 ff 48 8b 1d 59 e4 a6 01 48 81 fb 60 21 0e 83 0f
> > 84 4c 01 00 00 c7 45 c4 fe ff ff ff 45 31 e4 e8 4d 66 c4 ff <44> 8b 6b 20
> > 41 f6 c5 60 0f 85 03 01 00 00 e8 3a 66 c4 ff 44 89
> > RIP: __crypto_alg_lookup+0x43/0x190 crypto/api.c:63 RSP: c900017ebd28
> > CR2: 0020
> > ---[ end trace ff8e3661b9c3aa04 ]---
> > Kernel panic - not syncing: Fatal exception
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Kernel Offset: disabled
> > Rebooting in 86400 seconds..
> 
> This bug seems to be triggerable again by the fact that some strange 
> mask/type 
> combo is used. When setting it to zero, there is no crash.
> 
> Therefore I would see that this issue would be covered with the fix that we 
> currently work on to limit the number of allowed mask/type values.
> 

AF_ALG definitely should be locked down more, but we should fix the bugs before
they get hidden; after all, they are likely reachable in other ways as well.
There is a bug in pcrypt which seems to be causing some of these reports.  I've
sent out a fix and will mark the reports as a duplicate of the first:

#syz dup: KASAN: use-after-free Read in __list_del_entry_valid (2)

(It's a bit messy because the bug is causing a bogus pointer to be kfree()'d,
which can result in different symptoms, especially with slab merging.)


[PATCH] crypto: pcrypt - fix freeing pcrypt instances

2017-12-20 Thread Eric Biggers
From: Eric Biggers 

pcrypt is using the old way of freeing instances, where the ->free()
method specified in the 'struct crypto_template' is passed a pointer to
the 'struct crypto_instance'.  But the crypto_instance is being
kfree()'d directly, which is incorrect because the memory was actually
allocated as an aead_instance, which contains the crypto_instance at a
nonzero offset.  Thus, the wrong pointer was being kfree()'d.

Fix it by switching to the new way to free aead_instance's where the
->free() method is specified in the aead_instance itself.

Reported-by: syzbot 
Fixes: 0496f56065e0 ("crypto: pcrypt - Add support for new AEAD interface")
Cc:  # v4.2+
Signed-off-by: Eric Biggers 
---
 crypto/pcrypt.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index ee9cfb99fe25..f8ec3d4ba4a8 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -254,6 +254,14 @@ static void pcrypt_aead_exit_tfm(struct crypto_aead *tfm)
crypto_free_aead(ctx->child);
 }
 
+static void pcrypt_free(struct aead_instance *inst)
+{
+   struct pcrypt_instance_ctx *ctx = aead_instance_ctx(inst);
+
+   crypto_drop_aead(&ctx->spawn);
+   kfree(inst);
+}
+
 static int pcrypt_init_instance(struct crypto_instance *inst,
struct crypto_alg *alg)
 {
@@ -319,6 +327,8 @@ static int pcrypt_create_aead(struct crypto_template *tmpl, 
struct rtattr **tb,
inst->alg.encrypt = pcrypt_aead_encrypt;
inst->alg.decrypt = pcrypt_aead_decrypt;
 
+   inst->free = pcrypt_free;
+
err = aead_register_instance(tmpl, inst);
if (err)
goto out_drop_aead;
@@ -349,14 +359,6 @@ static int pcrypt_create(struct crypto_template *tmpl, 
struct rtattr **tb)
return -EINVAL;
 }
 
-static void pcrypt_free(struct crypto_instance *inst)
-{
-   struct pcrypt_instance_ctx *ctx = crypto_instance_ctx(inst);
-
-   crypto_drop_aead(&ctx->spawn);
-   kfree(inst);
-}
-
 static int pcrypt_cpumask_change_notify(struct notifier_block *self,
unsigned long val, void *data)
 {
@@ -469,7 +471,6 @@ static void pcrypt_fini_padata(struct padata_pcrypt *pcrypt)
 static struct crypto_template pcrypt_tmpl = {
.name = "pcrypt",
.create = pcrypt_create,
-   .free = pcrypt_free,
.module = THIS_MODULE,
 };
 
-- 
2.15.1.620.gb9897f4670-goog



Re: [PATCH v2 2/2] crypto: Fix out-of-bounds access of the AAD buffer in generic-gcm-aesni

2017-12-20 Thread Junaid Shahid
On Wednesday, December 20, 2017 1:12:54 PM PST Eric Biggers wrote:
> > 
> > We do need both registers, though we could certainly swap their usage to 
> > make
> > r12 the temp register. The reason we need the second register is because we
> > need to keep the original length to perform the pshufb at the end. But, of
> > course, that will not be needed anymore if we avoid the pshufb by 
> > duplicating
> > the _read_last_lt8 block or utilizing pslldq some other way.
> > 
> 
> If READ_PARTIAL_BLOCK can clobber 'DLEN' that would simplify it even more (no
> need for 'TMP1'), but what I am talking about here is how INITIAL_BLOCKS_DEC 
> and
> INITIAL_BLOCKS_ENC maintain two copies of the remaining length in lock-step in
> r11 and r12:
> 
> _get_AAD_blocks\num_initial_blocks\operation:
> movdqu (%r10), %xmm\i
> PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
> pxor   %xmm\i, \XMM2
> GHASH_MUL  \XMM2, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
> add$16, %r10
> sub$16, %r12
> sub$16, %r11
> cmp$16, %r11
> jge_get_AAD_blocks\num_initial_blocks\operation
> 
> The code which you are replacing with READ_PARTIAL_BLOCK actually needed the 
> two
> copies, but now it seems that only one copy is needed, so it can be simplified
> by only using r11.
> 

Sorry, I misunderstood earlier. I’ll remove the extra register from the 
preceding code in INIITIAL_BLOCKS_ENC/DEC.

Thanks,
Junaid


Re: [PATCH] [RFT] crypto: aes-generic - turn off -ftree-pre and -ftree-sra

2017-12-20 Thread Jakub Jelinek
On Wed, Dec 20, 2017 at 09:52:05PM +0100, Arnd Bergmann wrote:
> diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
> index ca554d57d01e..35f973ba9878 100644
> --- a/crypto/aes_generic.c
> +++ b/crypto/aes_generic.c
> @@ -1331,6 +1331,20 @@ EXPORT_SYMBOL_GPL(crypto_aes_set_key);
>   f_rl(bo, bi, 3, k); \
>  } while (0)
>  
> +#if __GNUC__ >= 7
> +/*
> + * Newer compilers try to optimize integer arithmetic more aggressively,
> + * which generally improves code quality a lot, but in this specific case
> + * ends up hurting more than it helps, in some configurations drastically
> + * so. This turns off two optimization steps that have been shown to
> + * lead to rather badly optimized code with gcc-7.
> + *
> + * See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
> + */
> +#pragma GCC optimize("-fno-tree-pre")
> +#pragma GCC optimize("-fno-tree-sra")

So do it only when UBSAN is enabled?  GCC doesn't have a particular
predefined macro for those (only for asan and tsan), but either the kernel
does have something already, or could have something added in the
corresponding Makefile.

Jakub


Re: [PATCH] [RFT] crypto: aes-generic - turn off -ftree-pre and -ftree-sra

2017-12-20 Thread Arnd Bergmann
On Wed, Dec 20, 2017 at 10:14 PM, Ard Biesheuvel
 wrote:
> On 20 December 2017 at 20:52, Arnd Bergmann  wrote:
>
> You can use the tcrypt.ko module to benchmark AES.
>
> modprobe tcrypt mode=200 sec=1

Ok, that's what I was looking for. I don't think I'll have time to
analyze this before
my Christmas break (I'm only here one more day, and have not set up a test
environment for this)

> to run a (lengthy) AES benchmark in various modes. AES-128 in ECB mode
> using the largest block size tested is what I usually use for
> comparison.
>
> On my Cortex-A57, the generic AES code runs at ~18 cycles per byte.
> Note that we have alternative scalar implementations on ARM and arm64
> that are faster so the performance of aes-generic is not really
> relevant (and so it is essentially dead code)

Ok. arm64 is also the least affected by this problem out of all architectures,
but it most architectures don't have an optimized implementation.

  Arnd


Re: [PATCH 1/2] padata: Remove FSF address

2017-12-20 Thread Cheah Kok Cheong
On Wed, Dec 20, 2017 at 09:20:48PM +0100, Philippe Ombredanne wrote:
> On Wed, Dec 20, 2017 at 9:15 PM, Cheah Kok Cheong  wrote:
> > Remove FSF address otherwise checkpatch will flag my next patch.
> >
> > Signed-off-by: Cheah Kok Cheong 
> > ---
> >  kernel/padata.c | 4 
> >  1 file changed, 4 deletions(-)
> >
> > diff --git a/kernel/padata.c b/kernel/padata.c
> > index 57c0074..9d91909 100644
> > --- a/kernel/padata.c
> > +++ b/kernel/padata.c
> > @@ -14,10 +14,6 @@
> >   * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >   * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License 
> > for
> >   * more details.
> > - *
> > - * You should have received a copy of the GNU General Public License along 
> > with
> > - * this program; if not, write to the Free Software Foundation, Inc.,
> > - * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
> >   */
> >
> >  #include 
> > --
> > 2.7.4
> >
> 
> 
> Why not use instead the simpler SPDX ids?
> -- 
> Cordially
> Philippe Ombredanne

Hi Philippe,
  Adding the SPDX id can be the subject of a separate patch.
Believe you are part of the team doing an audit in this matter.
I shall leave adding the SPDX id to you guys the professionals.
Looks like you guys are using script to do this en masse.

But if you insist and our maintainer is agreeable, I can send
another patch to add the SPDX id.

Thanks.
Brgds,
CheahKC 


Re: [PATCH] [RFT] crypto: aes-generic - turn off -ftree-pre and -ftree-sra

2017-12-20 Thread Ard Biesheuvel
On 20 December 2017 at 20:52, Arnd Bergmann  wrote:
> While testing other changes, I discovered that gcc-7.2.1 produces badly
> optimized code for aes_encrypt/aes_decrypt. This is especially true when
> CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
> large stack usage that in turn might cause kernel stack overflows:
>
> crypto/aes_generic.c: In function 'aes_encrypt':
> crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger 
> than 2048 bytes [-Wframe-larger-than=]
> crypto/aes_generic.c: In function 'aes_decrypt':
> crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger 
> than 2048 bytes [-Wframe-larger-than=]
>
> I verified that this problem exists on all architectures that are
> supported by gcc-7.2, though arm64 in particular is less affected than
> the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
> stack usage but still produce worse code than earlier versions for this
> file, apparently because of optimization passes that generally provide
> a substantial improvement in object code quality but understandably fail
> to find any shortcuts in the AES algorithm.
>
> Turning off the tree-pre and tree-sra optimization steps seems to
> reverse the effect, and could be used as a workaround in case we
> don't get a good gcc fix.
>
> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
> Cc: Richard Biener 
> Cc: Jakub Jelinek 
> Signed-off-by: Arnd Bergmann 
> ---
> Jakub and Richard have done a more detailed analysis of this, and are
> working on ways to improve the code again. In the meantime, I'm sending
> this workaround to the Linux crypto maintainers to make them aware of
> this issue and for testing.
>
> What would be a good way to test the performance of the AES code with
> the various combinations of compiler versions, as well as UBSAN and this
> patch enabled or disabled?

You can use the tcrypt.ko module to benchmark AES.

modprobe tcrypt mode=200 sec=1

to run a (lengthy) AES benchmark in various modes. AES-128 in ECB mode
using the largest block size tested is what I usually use for
comparison.

On my Cortex-A57, the generic AES code runs at ~18 cycles per byte.
Note that we have alternative scalar implementations on ARM and arm64
that are faster so the performance of aes-generic is not really
relevant (and so it is essentially dead code)


> ---
>  crypto/aes_generic.c | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
> index ca554d57d01e..35f973ba9878 100644
> --- a/crypto/aes_generic.c
> +++ b/crypto/aes_generic.c
> @@ -1331,6 +1331,20 @@ EXPORT_SYMBOL_GPL(crypto_aes_set_key);
> f_rl(bo, bi, 3, k); \
>  } while (0)
>
> +#if __GNUC__ >= 7
> +/*
> + * Newer compilers try to optimize integer arithmetic more aggressively,
> + * which generally improves code quality a lot, but in this specific case
> + * ends up hurting more than it helps, in some configurations drastically
> + * so. This turns off two optimization steps that have been shown to
> + * lead to rather badly optimized code with gcc-7.
> + *
> + * See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
> + */
> +#pragma GCC optimize("-fno-tree-pre")
> +#pragma GCC optimize("-fno-tree-sra")
> +#endif
> +
>  static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
>  {
> const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
> --
> 2.9.0
>


Re: [PATCH v2 2/2] crypto: Fix out-of-bounds access of the AAD buffer in generic-gcm-aesni

2017-12-20 Thread Eric Biggers
On Wed, Dec 20, 2017 at 11:35:44AM -0800, Junaid Shahid wrote:
> On Wednesday, December 20, 2017 12:42:10 AM PST Eric Biggers wrote:
> > > -_get_AAD_rest0\num_initial_blocks\operation:
> > > - /* finalize: shift out the extra bytes we read, and align
> > > - left. since pslldq can only shift by an immediate, we use
> > > - vpshufb and an array of shuffle masks */
> > > - movq   %r12, %r11
> > > - salq   $4, %r11
> > > - movdqu aad_shift_arr(%r11), \TMP1
> > > - PSHUFB_XMM \TMP1, %xmm\i
> > 
> > aad_shift_arr is no longer used, so it should be removed.
> 
> Ack.
> 
> > 
> > > -_get_AAD_rest_final\num_initial_blocks\operation:
> > > + READ_PARTIAL_BLOCK %r10, %r12, %r11, \TMP1, \TMP2, %xmm\i
> > 
> > It seems that INITIAL_BLOCKS_DEC and INITIAL_BLOCKS_ENC maintain both %r11 
> > and
> > %r12 as the AAD length.  %r11 is used for real earlier, but here %r12 is 
> > used
> > for real while %r11 is a temporary register.  Can this be simplified to 
> > have the
> > AAD length in %r11 only?
> > 
> 
> We do need both registers, though we could certainly swap their usage to make
> r12 the temp register. The reason we need the second register is because we
> need to keep the original length to perform the pshufb at the end. But, of
> course, that will not be needed anymore if we avoid the pshufb by duplicating
> the _read_last_lt8 block or utilizing pslldq some other way.
> 

If READ_PARTIAL_BLOCK can clobber 'DLEN' that would simplify it even more (no
need for 'TMP1'), but what I am talking about here is how INITIAL_BLOCKS_DEC and
INITIAL_BLOCKS_ENC maintain two copies of the remaining length in lock-step in
r11 and r12:

_get_AAD_blocks\num_initial_blocks\operation:
movdqu (%r10), %xmm\i
PSHUFB_XMM %xmm14, %xmm\i # byte-reflect the AAD data
pxor   %xmm\i, \XMM2
GHASH_MUL  \XMM2, \TMP3, \TMP1, \TMP2, \TMP4, \TMP5, \XMM1
add$16, %r10
sub$16, %r12
sub$16, %r11
cmp$16, %r11
jge_get_AAD_blocks\num_initial_blocks\operation

The code which you are replacing with READ_PARTIAL_BLOCK actually needed the two
copies, but now it seems that only one copy is needed, so it can be simplified
by only using r11.

Eric


Re: [PATCH v2 1/2] crypto: Fix out-of-bounds access of the data buffer in generic-gcm-aesni

2017-12-20 Thread Eric Biggers
On Wed, Dec 20, 2017 at 11:28:27AM -0800, Junaid Shahid wrote:
> > > + # adjust the shuffle mask pointer to be able to shift either 0 or 8
> > > + # bytes depending on whether the last block is <8 bytes or not
> > > +mov \DLEN, \TMP1
> > > +and $8, \TMP1
> > > + lea SHIFT_MASK(%rip), %rax
> > > + sub \TMP1, %rax
> > > + movdqu (%rax), \XMM2# get the appropriate shuffle mask
> > > + PSHUFB_XMM \XMM2, \XMM1 # shift left either 0 or 8 bytes
> > 
> > Is there any way this can be simplified to use pslldq?  The shift is just 8
> > bytes or nothing, and we already had to branch earlier depending on whether 
> > we
> > needed to read the 8 bytes or not.
> 
> I am not sure if we can use a simple pslldq without either introducing another
> branch, or duplicating the _read_last_lt8 block for each case of the original
> branch. Do you think that it is better to just duplicate the _read_last_lt8
> block instead of using pshufb? Or do you have any other suggestion about how
> to do it?
> 

The best I can come up with now is just duplicating the "read one byte at a
time" instructions (see below).  One way to avoid the duplication would be to
read the 1-7 byte remainder (end of the block) first and the 8-byte word
(beginning of the block) second, but it would be a bit weird.

# read the last <16 byte block
# Clobbers %rax, TMP1 and XMM1
.macro READ_PARTIAL_BLOCK DPTR DLEN TMP1 XMM1 XMMDst
mov \DLEN, \TMP1
cmp $8, \DLEN
jl _read_partial_lt8_\@
mov (\DPTR), %rax
MOVQ_R64_XMM %rax, \XMMDst
sub $8, \TMP1
jz _done_read_partial_\@
xor %rax, %rax
1:
shl $8, %rax
mov 7(\DPTR, \TMP1, 1), %al
dec \TMP1
jnz 1b
MOVQ_R64_XMM %rax, \XMM1
pslldq $8, \XMM1
por \XMM1, \XMMDst
jmp _done_read_partial_\@

_read_partial_lt8_\@:
xor %rax, %rax
1:
shl $8, %rax
mov -1(\DPTR, \TMP1, 1), %al
dec \TMP1
jnz 1b
MOVQ_R64_XMM %rax, \XMMDst
_done_read_partial_\@:
.endm


[PATCH] [RFT] crypto: aes-generic - turn off -ftree-pre and -ftree-sra

2017-12-20 Thread Arnd Bergmann
While testing other changes, I discovered that gcc-7.2.1 produces badly
optimized code for aes_encrypt/aes_decrypt. This is especially true when
CONFIG_UBSAN_SANITIZE_ALL is enabled, where it leads to extremely
large stack usage that in turn might cause kernel stack overflows:

crypto/aes_generic.c: In function 'aes_encrypt':
crypto/aes_generic.c:1371:1: warning: the frame size of 4880 bytes is larger 
than 2048 bytes [-Wframe-larger-than=]
crypto/aes_generic.c: In function 'aes_decrypt':
crypto/aes_generic.c:1441:1: warning: the frame size of 4864 bytes is larger 
than 2048 bytes [-Wframe-larger-than=]

I verified that this problem exists on all architectures that are
supported by gcc-7.2, though arm64 in particular is less affected than
the others. I also found that gcc-7.1 and gcc-8 do not show the extreme
stack usage but still produce worse code than earlier versions for this
file, apparently because of optimization passes that generally provide
a substantial improvement in object code quality but understandably fail
to find any shortcuts in the AES algorithm.

Turning off the tree-pre and tree-sra optimization steps seems to
reverse the effect, and could be used as a workaround in case we
don't get a good gcc fix.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
Cc: Richard Biener 
Cc: Jakub Jelinek 
Signed-off-by: Arnd Bergmann 
---
Jakub and Richard have done a more detailed analysis of this, and are
working on ways to improve the code again. In the meantime, I'm sending
this workaround to the Linux crypto maintainers to make them aware of
this issue and for testing.

What would be a good way to test the performance of the AES code with
the various combinations of compiler versions, as well as UBSAN and this
patch enabled or disabled?
---
 crypto/aes_generic.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/crypto/aes_generic.c b/crypto/aes_generic.c
index ca554d57d01e..35f973ba9878 100644
--- a/crypto/aes_generic.c
+++ b/crypto/aes_generic.c
@@ -1331,6 +1331,20 @@ EXPORT_SYMBOL_GPL(crypto_aes_set_key);
f_rl(bo, bi, 3, k); \
 } while (0)
 
+#if __GNUC__ >= 7
+/*
+ * Newer compilers try to optimize integer arithmetic more aggressively,
+ * which generally improves code quality a lot, but in this specific case
+ * ends up hurting more than it helps, in some configurations drastically
+ * so. This turns off two optimization steps that have been shown to
+ * lead to rather badly optimized code with gcc-7.
+ *
+ * See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356
+ */
+#pragma GCC optimize("-fno-tree-pre")
+#pragma GCC optimize("-fno-tree-sra")
+#endif
+
 static void aes_encrypt(struct crypto_tfm *tfm, u8 *out, const u8 *in)
 {
const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);
-- 
2.9.0



Re: [PATCH 1/2] padata: Remove FSF address

2017-12-20 Thread Philippe Ombredanne
On Wed, Dec 20, 2017 at 9:15 PM, Cheah Kok Cheong  wrote:
> Remove FSF address otherwise checkpatch will flag my next patch.
>
> Signed-off-by: Cheah Kok Cheong 
> ---
>  kernel/padata.c | 4 
>  1 file changed, 4 deletions(-)
>
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 57c0074..9d91909 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -14,10 +14,6 @@
>   * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>   * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>   * more details.
> - *
> - * You should have received a copy of the GNU General Public License along 
> with
> - * this program; if not, write to the Free Software Foundation, Inc.,
> - * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
>   */
>
>  #include 
> --
> 2.7.4
>


Why not use instead the simpler SPDX ids?
-- 
Cordially
Philippe Ombredanne


[PATCH 2/2] padata: Remove redundant export.h header

2017-12-20 Thread Cheah Kok Cheong
It is already included in module.h, see
commit f50169324df4 ("module.h: split out the EXPORT_SYMBOL into export.h")
for their relationship.

Signed-off-by: Cheah Kok Cheong 
---
 kernel/padata.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 9d91909..e265953 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -16,7 +16,6 @@
  * more details.
  */
 
-#include 
 #include 
 #include 
 #include 
-- 
2.7.4



[PATCH 1/2] padata: Remove FSF address

2017-12-20 Thread Cheah Kok Cheong
Remove FSF address otherwise checkpatch will flag my next patch.

Signed-off-by: Cheah Kok Cheong 
---
 kernel/padata.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/kernel/padata.c b/kernel/padata.c
index 57c0074..9d91909 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -14,10 +14,6 @@
  * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
  * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
  * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program; if not, write to the Free Software Foundation, Inc.,
- * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
 #include 
-- 
2.7.4



[PATCH RFC 2/3] crypto: Implement a generic crypto statistics

2017-12-20 Thread Corentin Labbe
This patch implement a generic way to get statistics about all crypto
usages.

Signed-off-by: Corentin Labbe 
---
 crypto/Kconfig |  11 +++
 crypto/ahash.c |  18 +
 crypto/algapi.c| 186 +
 crypto/rng.c   |   3 +
 include/crypto/acompress.h |  10 +++
 include/crypto/akcipher.h  |  12 +++
 include/crypto/kpp.h   |   9 +++
 include/crypto/rng.h   |   5 ++
 include/crypto/skcipher.h  |   8 ++
 include/linux/crypto.h |  22 ++
 10 files changed, 284 insertions(+)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index d6e9b60fc063..69f1822a026b 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1781,6 +1781,17 @@ config CRYPTO_USER_API_AEAD
  This option enables the user-spaces interface for AEAD
  cipher algorithms.
 
+config CRYPTO_STATS
+   bool "Crypto usage statistics for User-space"
+   help
+ This option enables the gathering of crypto stats.
+ This will collect:
+ - encrypt/decrypt size and numbers of symmeric operations
+ - compress/decompress size and numbers of compress operations
+ - size and numbers of hash operations
+ - encrypt/decrypt/sign/verify numbers for asymmetric operations
+ - generate/seed numbers for rng operations
+
 config CRYPTO_HASH_INFO
bool
 
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 3a35d67de7d9..93b56892f1b8 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -356,18 +356,36 @@ static int crypto_ahash_op(struct ahash_request *req,
 
 int crypto_ahash_final(struct ahash_request *req)
 {
+#ifdef CONFIG_CRYPTO_STATS
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+
+   tfm->base.__crt_alg->enc_cnt++;
+   tfm->base.__crt_alg->enc_tlen += req->nbytes;
+#endif
return crypto_ahash_op(req, crypto_ahash_reqtfm(req)->final);
 }
 EXPORT_SYMBOL_GPL(crypto_ahash_final);
 
 int crypto_ahash_finup(struct ahash_request *req)
 {
+#ifdef CONFIG_CRYPTO_STATS
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+
+   tfm->base.__crt_alg->enc_cnt++;
+   tfm->base.__crt_alg->enc_tlen += req->nbytes;
+#endif
return crypto_ahash_op(req, crypto_ahash_reqtfm(req)->finup);
 }
 EXPORT_SYMBOL_GPL(crypto_ahash_finup);
 
 int crypto_ahash_digest(struct ahash_request *req)
 {
+#ifdef CONFIG_CRYPTO_STATS
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+
+   tfm->base.__crt_alg->enc_cnt++;
+   tfm->base.__crt_alg->enc_tlen += req->nbytes;
+#endif
return crypto_ahash_op(req, crypto_ahash_reqtfm(req)->digest);
 }
 EXPORT_SYMBOL_GPL(crypto_ahash_digest);
diff --git a/crypto/algapi.c b/crypto/algapi.c
index b8f6122f37e9..4fca4576af78 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -20,11 +20,158 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "internal.h"
 
 static LIST_HEAD(crypto_template_list);
 
+#ifdef CONFIG_CRYPTO_STATS
+static struct kobject *crypto_root_kobj;
+
+static struct kobj_type dynamic_kobj_ktype = {
+   .sysfs_ops  = &kobj_sysfs_ops,
+};
+
+static ssize_t fcrypto_priority(struct kobject *kobj,
+   struct kobj_attribute *attr, char *buf)
+{
+   struct crypto_alg *alg;
+
+   alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
+   return snprintf(buf, 9, "%d\n", alg->cra_priority);
+}
+
+static ssize_t fcrypto_stat_enc_cnt(struct kobject *kobj,
+   struct kobj_attribute *attr, char *buf)
+{
+   struct crypto_alg *alg;
+
+   alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
+   return snprintf(buf, 9, "%lu\n", alg->enc_cnt);
+}
+
+static ssize_t fcrypto_stat_enc_tlen(struct kobject *kobj,
+struct kobj_attribute *attr, char *buf)
+{
+   struct crypto_alg *alg;
+
+   alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
+   return snprintf(buf, 9, "%lu\n", alg->enc_tlen);
+}
+
+static ssize_t fcrypto_stat_dec_cnt(struct kobject *kobj,
+   struct kobj_attribute *attr, char *buf)
+{
+   struct crypto_alg *alg;
+
+   alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
+   return snprintf(buf, 9, "%lu\n", alg->dec_cnt);
+}
+
+static ssize_t fcrypto_stat_dec_tlen(struct kobject *kobj,
+struct kobj_attribute *attr, char *buf)
+{
+   struct crypto_alg *alg;
+
+   alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
+   return snprintf(buf, 9, "%lu\n", alg->dec_tlen);
+}
+
+static ssize_t fcrypto_stat_verify_cnt(struct kobject *kobj,
+  struct kobj_attribute *attr, char *buf)
+{
+   struct crypto_alg *alg;
+
+   alg = container_of(kobj, struct crypto_alg, cra_stat_obj);
+   return snprintf(buf, 9, "%lu\n", alg->verify_cnt);
+}
+
+static ssize_t fcrypto_stat_sign_cnt(struct kobject *kobj,
+

[PATCH RFC 1/3] crypto: Prevent to register duplicate cra_driver_name

2017-12-20 Thread Corentin Labbe
Each crypto algorithm "cra_name" can have multiple implementation called
"cra_driver_name".
If two different implementation have the same cra_driver_name, nothing
can easily differentiate them.
Furthermore the mechanism for getting a crypto algorithm with its
implementation name (crypto_alg_match() in crypto/crypto_user.c) will
get only the first one found.

So this patch prevent the registration of two implementation with the
same cra_driver_name.

Signed-off-by: Corentin Labbe 
---
 crypto/algapi.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/crypto/algapi.c b/crypto/algapi.c
index 60d7366ed343..b8f6122f37e9 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -208,6 +208,11 @@ static struct crypto_larval *__crypto_register_alg(struct 
crypto_alg *alg)
goto err;
continue;
}
+   if (!strcmp(q->cra_driver_name, alg->cra_driver_name)) {
+   pr_err("Cannot register since cra_driver_name %s is 
already used\n",
+  alg->cra_driver_name);
+   goto err;
+   }
 
if (!strcmp(q->cra_driver_name, alg->cra_name) ||
!strcmp(q->cra_name, alg->cra_driver_name))
-- 
2.13.6



[PATCH RFC 3/3] crypto: tools: Add cryptostat userspace

2017-12-20 Thread Corentin Labbe
Add an example tool for getting easily crypto statistics.

Signed-off-by: Corentin Labbe 
---
 tools/crypto/cryptostat | 40 
 1 file changed, 40 insertions(+)
 create mode 100755 tools/crypto/cryptostat

diff --git a/tools/crypto/cryptostat b/tools/crypto/cryptostat
new file mode 100755
index ..314e108eb608
--- /dev/null
+++ b/tools/crypto/cryptostat
@@ -0,0 +1,40 @@
+#!/bin/bash
+
+for dn in `ls /sys/kernel/crypto/`
+do
+   dnpath=/sys/kernel/crypto/$dn/
+   algtype=$(cat $dnpath/algtype)
+   algname=$(cat $dnpath/algname)
+   echo "$dn ($algtype) $algname"
+   case $algtype in
+   hash)
+   echo -e "\tHash: $(cat $dnpath/enc_cnt)\tbytes $(cat $dnpath/enc_tlen)"
+   ;;
+   cipher)
+   echo -e "\tEncrypt: $(cat $dnpath/enc_cnt)\tbytes $(cat 
$dnpath/enc_tlen)"
+   echo -e "\tDecrypt: $(cat $dnpath/dec_cnt)\tbytes $(cat 
$dnpath/dec_tlen)"
+   ;;
+   compress)
+   echo -e "\tCompress: $(cat $dnpath/enc_cnt)\tbytes $(cat 
$dnpath/enc_tlen)"
+   echo -e "\tDecompress: $(cat $dnpath/dec_cnt)\tbytes $(cat 
$dnpath/dec_tlen)"
+   ;;
+   asymetric)
+   echo -e "\tEncrypt: $(cat $dnpath/enc_cnt)\tbytes $(cat 
$dnpath/enc_tlen)"
+   echo -e "\tDecrypt: $(cat $dnpath/dec_cnt)\tbytes $(cat 
$dnpath/dec_tlen)"
+   echo -e "\tVerify: $(cat $dnpath/verify_cnt)"
+   echo -e "\tSign: $(cat $dnpath/sign_cnt)"
+   ;;
+   rng)
+   echo -e "\tGenerate: $(cat $dnpath/enc_cnt)\tbytes $(cat 
$dnpath/enc_tlen)"
+   echo -e "\tSeed: $(cat $dnpath/dec_cnt)"
+   ;;
+   kpp)
+   echo -e "\tSet_secret: $(cat $dnpath/dec_cnt)"
+   echo -e "\tGenerate_public_key: $(cat $dnpath/enc_cnt)"
+   echo -e "\tCompute_shared_secret: $(cat $dnpath/verify_cnt)"
+   ;;
+   *)
+   echo -e "\t$(cat $dnpath/enc_cnt)\tbytes $(cat $dnpath/enc_tlen)"
+   ;;
+   esac
+done
-- 
2.13.6



[PATCH RFC 0/3] crypto: Implement a generic crypto statistics

2017-12-20 Thread Corentin Labbe
Hello

This patch is a try to implement a generic crypto driver statistics.
The goal is to have an "ifconfig" for crypto device.

Some driver tried to implement this via a debugfs interface.

My proposed way is to embed this in the crypto framework by registering
a /sys/kernel/crypto tree and create a directory for each cra_driver_name.
Note that I assume than cra_driver_name will only exists once, which seems 
false for arm64 ctr-aes-ce.
But for me, it's a bug.
Each crypto algorithm "cra_name" can have multiple implementation called
"cra_driver_name".
If two different implementation have the same cra_driver_name, nothing
can easily differentiate them.
Furthermore the mechanism for getting a crypto algorithm with its
implementation name (crypto_alg_match() in crypto/crypto_user.c) will
get only the first one found.

Then an userspace tool will collect information in this tree.

Example of output:
./cryptostat
aes-arm (cipher) aes
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
aes-generic (cipher) aes
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
arc4-generic (cipher) arc4
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
cbc(aes-arm) (cipher) cbc(aes)
Encrypt: 94 bytes 6096
Decrypt: 94 bytes 6096
cbc-aes-sun8i-ce (cipher) cbc(aes)
Encrypt: 24 bytes 3712
Decrypt: 24 bytes 3712
cbc(des3_ede-generic) (cipher) cbc(des3_ede)
Encrypt: 14 bytes 5104
Decrypt: 14 bytes 5104
cbc-des3-sun8i-ce (cipher) cbc(des3_ede)
Encrypt: 10 bytes 3488
Decrypt: 10 bytes 3488
cipher_null-generic (cipher) cipher_null
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
compress_null-generic (compress) compress_null
Compress: 0 bytes 0
Decompress: 0   bytes 0
crc32c-generic (hash) crc32c
Hash: 88bytes 15826
crct10dif-generic (hash) crct10dif
Hash: 20bytes 2246
ctr(aes-arm) (cipher) ctr(aes)
Encrypt: 31 bytes 10225
Decrypt: 31 bytes 10225
ctr-aes-sun8i-ce (cipher) ctr(aes)
Encrypt: 24 bytes 6738
Decrypt: 24 bytes 6738
cts(cbc(aes-arm)) (cipher) cts(cbc(aes))
Encrypt: 33 bytes 1241
Decrypt: 33 bytes 1241
cts(cbc-aes-sun8i-ce) (cipher) cts(cbc(aes))
Encrypt: 24 bytes 956
Decrypt: 24 bytes 956
deflate-generic (compress) deflate
Compress: 0 bytes 0
Decompress: 0   bytes 0
deflate-scomp (compress) deflate
Compress: 2 bytes 261
Decompress: 4   bytes 320
des3_ede-generic (cipher) des3_ede
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
des-generic (cipher) des
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
dh-generic (kpp) dh
Set_secret: 2
Generate_public_key: 2
Compute_shared_secret: 2
digest_null-generic (hash) digest_null
Hash: 0 bytes 0
drbg_nopr_hmac_sha1 (rng) stdrng
Generate: 0 bytes 0
Seed: 0
drbg_nopr_hmac_sha256 (rng) stdrng
Generate: 8 bytes 1024
Seed: 4
drbg_nopr_hmac_sha384 (rng) stdrng
Generate: 0 bytes 0
Seed: 0
drbg_nopr_hmac_sha512 (rng) stdrng
Generate: 0 bytes 0
Seed: 0
drbg_pr_hmac_sha1 (rng) stdrng
Generate: 0 bytes 0
Seed: 0
drbg_pr_hmac_sha256 (rng) stdrng
Generate: 8 bytes 1024
Seed: 4
drbg_pr_hmac_sha384 (rng) stdrng
Generate: 0 bytes 0
Seed: 0
drbg_pr_hmac_sha512 (rng) stdrng
Generate: 0 bytes 0
Seed: 0
ecb(aes-arm) (cipher) ecb(aes)
Encrypt: 20 bytes 4160
Decrypt: 20 bytes 4160
ecb-aes-sun8i-ce (cipher) ecb(aes)
Encrypt: 18 bytes 3168
Decrypt: 18 bytes 3168
ecb(arc4)-generic (cipher) ecb(arc4)
Encrypt: 21 bytes 270
Decrypt: 21 bytes 270
ecb-cipher_null (cipher) ecb(cipher_null)
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
ecb(des3_ede-generic) (cipher) ecb(des3_ede)
Encrypt: 24 bytes 4584
Decrypt: 24 bytes 4584
ecb-des3-sun8i-ce (cipher) ecb(des3_ede)
Encrypt: 18 bytes 3072
Decrypt: 18 bytes 3072
hmac(sha256-neon) (hash) hmac(sha256)
Hash: 40bytes 1788
jitterentropy_rng (rng) jitterentropy_rng
Generate: 0 bytes 0
Seed: 0
lzo-generic (compress) lzo
Compress: 0 bytes 0
Decompress: 0   bytes 0
lzo-scomp (compress) lzo
Compress: 2 bytes 229
Decompress: 4   bytes 367
md5-generic (hash) md5
Hash: 28bytes 718
pkcs1pad(rsa-sun8i-ce,sha1) (asymetric) pkcs1pad(rsa,sha1)
Encrypt: 0  bytes 0
Decrypt: 0  bytes 0
Verify: 2
Sign: 0
rsa-generic (asymetric) rsa
Encrypt: 14 bytes 0
Decrypt: 9  bytes 0
Verify: 5
Sign: 0
rsa-sun8i-ce (asymetric) rsa
Encrypt: 7

Re: [PATCH] Fix NULL pointer deref. on no default_rng

2017-12-20 Thread Pierre
On Wednesday, November 29, 2017 5:34:30 PM CET Herbert Xu wrote:
> On Sun, Nov 12, 2017 at 03:24:32PM +0100, Pierre Ducroquet wrote:
> > If crypto_get_default_rng returns an error, the
> > function ecc_gen_privkey should return an error.
> > Instead, it currently tries to use the default_rng
> > nevertheless, thus creating a kernel panic with a
> > NULL pointer dereference.
> > Returning the error directly, as was supposedly
> > intended when looking at the code, fixes this.
> > 
> > Signed-off-by: Pierre Ducroquet 
> 
> Patch applied.  Thanks.

Hi

How long will it take for the patch to be merged in an official release ?
This simple patch fixes a kernel panic at boot, it would be great for it to be 
merged in 4.15.

Thanks
 Pierre

signature.asc
Description: This is a digitally signed message part.


Re: [PATCH v2 2/2] crypto: Fix out-of-bounds access of the AAD buffer in generic-gcm-aesni

2017-12-20 Thread Junaid Shahid
On Wednesday, December 20, 2017 12:42:10 AM PST Eric Biggers wrote:
> > -_get_AAD_rest0\num_initial_blocks\operation:
> > -   /* finalize: shift out the extra bytes we read, and align
> > -   left. since pslldq can only shift by an immediate, we use
> > -   vpshufb and an array of shuffle masks */
> > -   movq   %r12, %r11
> > -   salq   $4, %r11
> > -   movdqu aad_shift_arr(%r11), \TMP1
> > -   PSHUFB_XMM \TMP1, %xmm\i
> 
> aad_shift_arr is no longer used, so it should be removed.

Ack.

> 
> > -_get_AAD_rest_final\num_initial_blocks\operation:
> > +   READ_PARTIAL_BLOCK %r10, %r12, %r11, \TMP1, \TMP2, %xmm\i
> 
> It seems that INITIAL_BLOCKS_DEC and INITIAL_BLOCKS_ENC maintain both %r11 and
> %r12 as the AAD length.  %r11 is used for real earlier, but here %r12 is used
> for real while %r11 is a temporary register.  Can this be simplified to have 
> the
> AAD length in %r11 only?
> 

We do need both registers, though we could certainly swap their usage to make 
r12 the temp register. The reason we need the second register is because we 
need to keep the original length to perform the pshufb at the end. But, of 
course, that will not be needed anymore if we avoid the pshufb by duplicating 
the _read_last_lt8 block or utilizing pslldq some other way.

Thanks,
Junaid


Re: [PATCH v2 1/2] crypto: Fix out-of-bounds access of the data buffer in generic-gcm-aesni

2017-12-20 Thread Junaid Shahid
On Wednesday, December 20, 2017 12:36:16 AM PST Eric Biggers wrote:
> 
> Did you run the self-tests (boot with CONFIG_CRYPTO_MANAGER_DISABLE_TESTS
> unset)?  The second patch causes them to start failing:
> 
> [1.169640] alg: aead: Test 7 failed on encryption for rfc4106-gcm-aesni
> [1.178308] alg: aead: Test 4 failed on encryption for generic-gcm-aesni

Thanks for the pointer. Let me run the self-tests and see.

> 
> Also, don't the AVX and AVX2 versions have the same bug?  These patches only 
> fix
> the non-AVX version.

The AVX/AVX2 versions are used only when the data length is at least 640/4K 
bytes respectively. Therefore, the first bug doesn’t apply at all. The second 
bug does exist, but it won’t cause any ill effect at the present time because 
the overrun will be covered by the data bytes. However, I am planning to send 
out a separate patch series that allows for non-contiguous AAD, data and 
AuthTag buffers, which will cause the AAD overrun to manifest even in the AVX 
versions, so I am going to include the AVX version fixes in that series.

> 
> > +# read the last <16 byte block
> > +# Clobbers %rax, DPTR, TMP1 and XMM1-2
> > +.macro READ_PARTIAL_BLOCK DPTR DLEN TMP1 XMM1 XMM2 XMMDst
> > +pxor \XMMDst, \XMMDst
> > +mov \DLEN, \TMP1
> > +cmp $8, \DLEN
> > +jl _read_last_lt8_\@
> > +mov (\DPTR), %rax
> > +MOVQ_R64_XMM %rax, \XMMDst
> > +   add $8, \DPTR
> > +sub $8, \TMP1
> > +jz _done_read_last_partial_block_\@
> > +_read_last_lt8_\@:
> > +   shl $8, %rax
> > +mov -1(\DPTR, \TMP1, 1), %al
> > +dec \TMP1
> > +jnz _read_last_lt8_\@
> > +MOVQ_R64_XMM %rax, \XMM1
> 
> Doesn't %rax needs to be zeroed at '_read_last_lt8' so that the padding is
> zeroed?
> 

Ah, yes. It actually isn’t needed for the first patch, as in that case the 
result returned by this macro is masked appropriately at the call-sites anyway. 
But that is not the case for the second patch. That is probably also the reason 
for the failures that you noticed.

> > +   # adjust the shuffle mask pointer to be able to shift either 0 or 8
> > +   # bytes depending on whether the last block is <8 bytes or not
> > +mov \DLEN, \TMP1
> > +and $8, \TMP1
> > +   lea SHIFT_MASK(%rip), %rax
> > +   sub \TMP1, %rax
> > +   movdqu (%rax), \XMM2# get the appropriate shuffle mask
> > +   PSHUFB_XMM \XMM2, \XMM1 # shift left either 0 or 8 bytes
> 
> Is there any way this can be simplified to use pslldq?  The shift is just 8
> bytes or nothing, and we already had to branch earlier depending on whether we
> needed to read the 8 bytes or not.

I am not sure if we can use a simple pslldq without either introducing another 
branch, or duplicating the _read_last_lt8 block for each case of the original 
branch. Do you think that it is better to just duplicate the _read_last_lt8 
block instead of using pshufb? Or do you have any other suggestion about how to 
do it?

Thanks,
Junaid


Re: [PATCH] iommu/amd - Set the device table entry PPR bit for IOMMU V2 devices

2017-12-20 Thread Gary R Hook

Please ignore; sent to the wrong list. Mea culpa.

On 12/20/2017 10:57 AM, Gary R Hook wrote:

The AMD IOMMU specification Rev 3.00 (December 2016) introduces a
new Enhanced PPR Handling Support (EPHSup) bit in the MMIO register
offset 0030h (IOMMU Extended Feature Register).

When EPHSup=1, the IOMMU hardware requires the PPR bit of the
device table entry (DTE) to be set in order to support PPR for a
particular endpoint device.

Please see https://support.amd.com/TechDocs/48882_IOMMU.pdf for
this revision of the AMD IOMMU specification.

Signed-off-by: Gary R Hook 
---
  drivers/iommu/amd_iommu.c   |   20 +++-
  drivers/iommu/amd_iommu_types.h |2 ++
  2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index cb78933ef53f..5f3da95ff6a0 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1818,7 +1818,8 @@ static bool dma_ops_domain(struct protection_domain 
*domain)
return domain->flags & PD_DMA_OPS_MASK;
  }
  
-static void set_dte_entry(u16 devid, struct protection_domain *domain, bool ats)

+static void set_dte_entry(u16 devid, struct protection_domain *domain,
+ bool ats, bool ppr)
  {
u64 pte_root = 0;
u64 flags = 0;
@@ -1835,6 +1836,13 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain, bool ats)
if (ats)
flags |= DTE_FLAG_IOTLB;
  
+	if (ppr) {

+   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
+
+   if (iommu_feature(iommu, FEATURE_EPHSUP))
+   pte_root |= 1ULL << DEV_ENTRY_PPR;
+   }
+
if (domain->flags & PD_IOMMUV2_MASK) {
u64 gcr3 = iommu_virt_to_phys(domain->gcr3_tbl);
u64 glx  = domain->glx;
@@ -1897,9 +1905,9 @@ static void do_attach(struct iommu_dev_data *dev_data,
domain->dev_cnt += 1;
  
  	/* Update device table */

-   set_dte_entry(dev_data->devid, domain, ats);
+   set_dte_entry(dev_data->devid, domain, ats, dev_data->iommu_v2);
if (alias != dev_data->devid)
-   set_dte_entry(alias, domain, ats);
+   set_dte_entry(alias, domain, ats, dev_data->iommu_v2);
  
  	device_flush_dte(dev_data);

  }
@@ -2278,13 +2286,15 @@ static void update_device_table(struct 
protection_domain *domain)
struct iommu_dev_data *dev_data;
  
  	list_for_each_entry(dev_data, &domain->dev_list, list) {

-   set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled);
+   set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled,
+ dev_data->iommu_v2);
  
  		if (dev_data->devid == dev_data->alias)

continue;
  
  		/* There is an alias, update device table entry for it */

-   set_dte_entry(dev_data->alias, domain, dev_data->ats.enabled);
+   set_dte_entry(dev_data->alias, domain, dev_data->ats.enabled,
+ dev_data->iommu_v2);
}
  }
  
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h

index f6b24c7d8b70..6a877ebd058b 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -98,6 +98,7 @@
  #define FEATURE_HE(1ULL<<8)
  #define FEATURE_PC(1ULL<<9)
  #define FEATURE_GAM_VAPIC (1ULL<<21)
+#define FEATURE_EPHSUP (1ULL<<50)
  
  #define FEATURE_PASID_SHIFT	32

  #define FEATURE_PASID_MASK(0x1fULL << FEATURE_PASID_SHIFT)
@@ -192,6 +193,7 @@
  /* macros and definitions for device table entries */
  #define DEV_ENTRY_VALID 0x00
  #define DEV_ENTRY_TRANSLATION   0x01
+#define DEV_ENTRY_PPR   0x34
  #define DEV_ENTRY_IR0x3d
  #define DEV_ENTRY_IW0x3e
  #define DEV_ENTRY_NO_PAGE_FAULT   0x62





[PATCH] crypto: stm32 - Use standard CONFIG name

2017-12-20 Thread Corentin Labbe
All hardware crypto devices have their CONFIG names using the following
convention:
CRYPTO_DEV_name_algo

This patch apply this conventions on STM32 CONFIG names.

Signed-off-by: Corentin Labbe 
---
 drivers/crypto/stm32/Kconfig  | 6 +++---
 drivers/crypto/stm32/Makefile | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/crypto/stm32/Kconfig b/drivers/crypto/stm32/Kconfig
index 61ef00b6bf45..63aa78c0b12b 100644
--- a/drivers/crypto/stm32/Kconfig
+++ b/drivers/crypto/stm32/Kconfig
@@ -1,4 +1,4 @@
-config CRC_DEV_STM32
+config CRYPTO_DEV_STM32_CRC
tristate "Support for STM32 crc accelerators"
depends on ARCH_STM32
select CRYPTO_HASH
@@ -6,7 +6,7 @@ config CRC_DEV_STM32
   This enables support for the CRC32 hw accelerator which can be found
  on STMicroelectronics STM32 SOC.
 
-config HASH_DEV_STM32
+config CRYPTO_DEV_STM32_HASH
tristate "Support for STM32 hash accelerators"
depends on ARCH_STM32
depends on HAS_DMA
@@ -19,7 +19,7 @@ config HASH_DEV_STM32
   This enables support for the HASH hw accelerator which can be found
  on STMicroelectronics STM32 SOC.
 
-config CRYP_DEV_STM32
+config CRYPTO_DEV_STM32_CRYP
tristate "Support for STM32 cryp accelerators"
depends on ARCH_STM32
select CRYPTO_HASH
diff --git a/drivers/crypto/stm32/Makefile b/drivers/crypto/stm32/Makefile
index 2c19fc155bfd..53d1bb94b221 100644
--- a/drivers/crypto/stm32/Makefile
+++ b/drivers/crypto/stm32/Makefile
@@ -1,3 +1,3 @@
-obj-$(CONFIG_CRC_DEV_STM32) += stm32_crc32.o
-obj-$(CONFIG_HASH_DEV_STM32) += stm32-hash.o
-obj-$(CONFIG_CRYP_DEV_STM32) += stm32-cryp.o
+obj-$(CONFIG_CRYPTO_DEV_STM32_CRC) += stm32_crc32.o
+obj-$(CONFIG_CRYPTO_DEV_STM32_HASH) += stm32-hash.o
+obj-$(CONFIG_CRYPTO_DEV_STM32_CRYP) += stm32-cryp.o
-- 
2.13.6



[PATCH] staging: ccree: fix __dump_byte_array() declaration mismatch

2017-12-20 Thread Corentin Labbe
This patch corrects the type of the size argument in __dump_byte_array()
from unsigned long to size_t as done only in drivers/staging/ccree/ssi_driver.c

This fix also a build error:
drivers/staging/ccree/ssi_driver.c:82:6: error: conflicting types for 
'__dump_byte_array'

Fixes: 3f268f5d6669 ("staging: ccree: turn compile time debug log to params")
Signed-off-by: Corentin Labbe 
---
 drivers/staging/ccree/ssi_driver.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/ccree/ssi_driver.h 
b/drivers/staging/ccree/ssi_driver.h
index 5a56f7a76b71..0f57c9a8b8a6 100644
--- a/drivers/staging/ccree/ssi_driver.h
+++ b/drivers/staging/ccree/ssi_driver.h
@@ -174,8 +174,7 @@ static inline struct device *drvdata_to_dev(struct 
cc_drvdata *drvdata)
return &drvdata->plat_dev->dev;
 }
 
-void __dump_byte_array(const char *name, const u8 *the_array,
-  unsigned long size);
+void __dump_byte_array(const char *name, const u8 *buf, size_t size);
 static inline void dump_byte_array(const char *name, const u8 *the_array,
   unsigned long size)
 {
-- 
2.13.6



[PATCH] iommu/amd - Set the device table entry PPR bit for IOMMU V2 devices

2017-12-20 Thread Gary R Hook
The AMD IOMMU specification Rev 3.00 (December 2016) introduces a
new Enhanced PPR Handling Support (EPHSup) bit in the MMIO register
offset 0030h (IOMMU Extended Feature Register).

When EPHSup=1, the IOMMU hardware requires the PPR bit of the
device table entry (DTE) to be set in order to support PPR for a
particular endpoint device.

Please see https://support.amd.com/TechDocs/48882_IOMMU.pdf for
this revision of the AMD IOMMU specification.

Signed-off-by: Gary R Hook 
---
 drivers/iommu/amd_iommu.c   |   20 +++-
 drivers/iommu/amd_iommu_types.h |2 ++
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index cb78933ef53f..5f3da95ff6a0 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -1818,7 +1818,8 @@ static bool dma_ops_domain(struct protection_domain 
*domain)
return domain->flags & PD_DMA_OPS_MASK;
 }
 
-static void set_dte_entry(u16 devid, struct protection_domain *domain, bool 
ats)
+static void set_dte_entry(u16 devid, struct protection_domain *domain,
+ bool ats, bool ppr)
 {
u64 pte_root = 0;
u64 flags = 0;
@@ -1835,6 +1836,13 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain, bool ats)
if (ats)
flags |= DTE_FLAG_IOTLB;
 
+   if (ppr) {
+   struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
+
+   if (iommu_feature(iommu, FEATURE_EPHSUP))
+   pte_root |= 1ULL << DEV_ENTRY_PPR;
+   }
+
if (domain->flags & PD_IOMMUV2_MASK) {
u64 gcr3 = iommu_virt_to_phys(domain->gcr3_tbl);
u64 glx  = domain->glx;
@@ -1897,9 +1905,9 @@ static void do_attach(struct iommu_dev_data *dev_data,
domain->dev_cnt += 1;
 
/* Update device table */
-   set_dte_entry(dev_data->devid, domain, ats);
+   set_dte_entry(dev_data->devid, domain, ats, dev_data->iommu_v2);
if (alias != dev_data->devid)
-   set_dte_entry(alias, domain, ats);
+   set_dte_entry(alias, domain, ats, dev_data->iommu_v2);
 
device_flush_dte(dev_data);
 }
@@ -2278,13 +2286,15 @@ static void update_device_table(struct 
protection_domain *domain)
struct iommu_dev_data *dev_data;
 
list_for_each_entry(dev_data, &domain->dev_list, list) {
-   set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled);
+   set_dte_entry(dev_data->devid, domain, dev_data->ats.enabled,
+ dev_data->iommu_v2);
 
if (dev_data->devid == dev_data->alias)
continue;
 
/* There is an alias, update device table entry for it */
-   set_dte_entry(dev_data->alias, domain, dev_data->ats.enabled);
+   set_dte_entry(dev_data->alias, domain, dev_data->ats.enabled,
+ dev_data->iommu_v2);
}
 }
 
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index f6b24c7d8b70..6a877ebd058b 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -98,6 +98,7 @@
 #define FEATURE_HE (1ULL<<8)
 #define FEATURE_PC (1ULL<<9)
 #define FEATURE_GAM_VAPIC  (1ULL<<21)
+#define FEATURE_EPHSUP (1ULL<<50)
 
 #define FEATURE_PASID_SHIFT32
 #define FEATURE_PASID_MASK (0x1fULL << FEATURE_PASID_SHIFT)
@@ -192,6 +193,7 @@
 /* macros and definitions for device table entries */
 #define DEV_ENTRY_VALID 0x00
 #define DEV_ENTRY_TRANSLATION   0x01
+#define DEV_ENTRY_PPR   0x34
 #define DEV_ENTRY_IR0x3d
 #define DEV_ENTRY_IW0x3e
 #define DEV_ENTRY_NO_PAGE_FAULT0x62



Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Dmitry Vyukov
On Wed, Dec 20, 2017 at 12:49 PM, Stephan Mueller  wrote:
> Am Mittwoch, 20. Dezember 2017, 11:15:38 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>>
>> What will be its meaning? How will it differ from fix?
>
> Maybe a short clarification would help: what is the meaning of the syz fix
> marker?

It's described here:
https://github.com/google/syzkaller/blob/master/docs/syzbot.md#bug-status-tracking

> Depending on this answer, all that I am thinking of is to mark bug
> reports for which there are fixes actively discussed, but yet not integrated.
> Thus, such marker should only help others to point them to active discussions
> instead of them trying to find fixes alone.

If it's only for humans, then there is no need to make a special
machine-readable command for this.
So basically what you wrote above is good:

> This issue vanishes after applying the patch "[PATCH v2] crypto: AF_ALG - 
> limit mask and type".

I just didn't understand that's still pending (but perhaps that's what
you meant by including "[PATCH v2]" part).


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Stephan Mueller
Am Mittwoch, 20. Dezember 2017, 11:15:38 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> 
> What will be its meaning? How will it differ from fix?

Maybe a short clarification would help: what is the meaning of the syz fix 
marker? Depending on this answer, all that I am thinking of is to mark bug 
reports for which there are fixes actively discussed, but yet not integrated. 
Thus, such marker should only help others to point them to active discussions 
instead of them trying to find fixes alone.

Ciao
Stephan


[RFC crypto v3 8/9] chtls: Register the ULP

2017-12-20 Thread Atul Gupta
Add new uld driver for Inline TLS support. Register ULP for chtls.
Setsockopt to program key on chip. support AES GCM key size 128.

Signed-off-by: Atul Gupta 
---
v3: made some functions static
---
 drivers/crypto/chelsio/chtls/chtls_main.c | 584 ++
 include/uapi/linux/tls.h  |   1 +
 2 files changed, 585 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_main.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_main.c 
b/drivers/crypto/chelsio/chtls/chtls_main.c
new file mode 100644
index 000..fb2d441
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_main.c
@@ -0,0 +1,584 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+#define DRV_NAME "chtls"
+
+/*
+ * chtls device management
+ * maintains a list of the chtls devices
+ */
+static LIST_HEAD(cdev_list);
+static DEFINE_MUTEX(cdev_mutex);
+static DEFINE_MUTEX(cdev_list_lock);
+
+static struct proto chtls_base_prot;
+static struct proto chtls_cpl_prot;
+static DEFINE_MUTEX(notify_mutex);
+static RAW_NOTIFIER_HEAD(listen_notify_list);
+struct request_sock_ops chtls_rsk_ops;
+static uint send_page_order = (14 - PAGE_SHIFT < 0) ? 0 : 14 - PAGE_SHIFT;
+
+static int register_listen_notifier(struct notifier_block *nb)
+{
+   int err;
+
+   mutex_lock(¬ify_mutex);
+   err = raw_notifier_chain_register(&listen_notify_list, nb);
+   mutex_unlock(¬ify_mutex);
+   return err;
+}
+
+static int unregister_listen_notifier(struct notifier_block *nb)
+{
+   int err;
+
+   mutex_lock(¬ify_mutex);
+   err = raw_notifier_chain_unregister(&listen_notify_list, nb);
+   mutex_unlock(¬ify_mutex);
+   return err;
+}
+
+static int listen_notify_handler(struct notifier_block *this,
+unsigned long event, void *data)
+{
+   struct sock *sk = data;
+   struct chtls_dev *cdev;
+   int ret =  NOTIFY_DONE;
+
+   switch (event) {
+   case CHTLS_LISTEN_START:
+   case CHTLS_LISTEN_STOP:
+   mutex_lock(&cdev_list_lock);
+   list_for_each_entry(cdev, &cdev_list, list) {
+   if (event == CHTLS_LISTEN_START)
+   ret = chtls_listen_start(cdev, sk);
+   else
+   chtls_listen_stop(cdev, sk);
+   }
+   mutex_unlock(&cdev_list_lock);
+   break;
+   }
+   return ret;
+}
+
+static struct notifier_block listen_notifier = {
+   .notifier_call = listen_notify_handler
+};
+
+static int listen_backlog_rcv(struct sock *sk, struct sk_buff *skb)
+{
+   if (likely(skb_transport_header(skb) != skb_network_header(skb)))
+   return tcp_v4_do_rcv(sk, skb);
+   BLOG_SKB_CB(skb)->backlog_rcv(sk, skb);
+   return 0;
+}
+
+static int chtls_start_listen(struct sock *sk)
+{
+   int err;
+
+   if (sk->sk_protocol != IPPROTO_TCP)
+   return -EPROTONOSUPPORT;
+
+   if (sk->sk_family == PF_INET &&
+   LOOPBACK(inet_sk(sk)->inet_rcv_saddr))
+   return -EADDRNOTAVAIL;
+
+   sk->sk_backlog_rcv = listen_backlog_rcv;
+   mutex_lock(¬ify_mutex);
+   err = raw_notifier_call_chain(&listen_notify_list, 0, sk);
+   mutex_unlock(¬ify_mutex);
+   return err;
+}
+
+static int chtls_hash(struct sock *sk)
+{
+   int err;
+
+   err = tcp_prot.hash(sk);
+   if (sk->sk_state == TCP_LISTEN)
+   err |= chtls_start_listen(sk);
+
+   if (err)
+   tcp_prot.unhash(sk);
+   return err;
+}
+
+static int chtls_stop_listen(struct sock *sk)
+{
+   if (sk->sk_protocol != IPPROTO_TCP)
+   return -EPROTONOSUPPORT;
+
+   mutex_lock(¬ify_mutex);
+   raw_notifier_call_chain(&listen_notify_list, 1, sk);
+   mutex_unlock(¬ify_mutex);
+   return 0;
+}
+
+static void chtls_unhash(struct sock *sk)
+{
+   if (sk->sk_state == TCP_LISTEN)
+   chtls_stop_listen(sk);
+   tcp_prot.unhash(sk);
+}
+
+static void chtls_lsk_close(struct sock *sk, long timeout)
+{
+   struct tls_context *ctx = tls_get_ctx(sk);
+   void (*sk_proto_close)(struct sock *sk, long timeout);
+
+   lock_sock(sk);
+   sk_proto_close = ctx->sk_proto_close;
+   kfree(ctx);
+
+   release_sock(sk);
+   sk_proto_close(sk, timeout);
+}
+
+static void process_deferq(struct work_struct *task_param)
+{
+   struct chtls_dev *cdev = container_of(task_param,
+   struct chtls_dev, deferq_task);
+   struct sk_buff *sk

[RFC crypto v3 9/9] Makefile Kconfig

2017-12-20 Thread Atul Gupta
Entry for Inline TLS as another driver dependent on cxgb4 and chcr

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/Kconfig| 10 ++
 drivers/crypto/chelsio/Makefile   |  1 +
 drivers/crypto/chelsio/chtls/Makefile |  4 
 3 files changed, 15 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/Makefile

diff --git a/drivers/crypto/chelsio/Kconfig b/drivers/crypto/chelsio/Kconfig
index 51932c7..686d246 100644
--- a/drivers/crypto/chelsio/Kconfig
+++ b/drivers/crypto/chelsio/Kconfig
@@ -28,3 +28,13 @@ config CHELSIO_IPSEC_INLINE
 default n
 ---help---
   Enable support for IPSec Tx Inline.
+
+config CRYPTO_DEV_CHELSIO_TLS
+tristate "Chelsio Crypto Inline TLS Driver"
+depends on CHELSIO_T4
+select CRYPTO_DEV_CHELSIO
+---help---
+  Support Chelsio Inline TLS with Chelsio crypto accelerator.
+
+  To compile this driver as a module, choose M here: the module
+  will be called chtls.
diff --git a/drivers/crypto/chelsio/Makefile b/drivers/crypto/chelsio/Makefile
index eaecaf1..639e571 100644
--- a/drivers/crypto/chelsio/Makefile
+++ b/drivers/crypto/chelsio/Makefile
@@ -3,3 +3,4 @@ ccflags-y := -Idrivers/net/ethernet/chelsio/cxgb4
 obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chcr.o
 chcr-objs :=  chcr_core.o chcr_algo.o
 chcr-$(CONFIG_CHELSIO_IPSEC_INLINE) += chcr_ipsec.o
+obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls/
diff --git a/drivers/crypto/chelsio/chtls/Makefile 
b/drivers/crypto/chelsio/chtls/Makefile
new file mode 100644
index 000..df13795
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/Makefile
@@ -0,0 +1,4 @@
+ccflags-y := -Idrivers/net/ethernet/chelsio/cxgb4 -Idrivers/crypto/chelsio/
+
+obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls.o
+chtls-objs := chtls_main.o chtls_cm.o chtls_io.o chtls_hw.o
-- 
1.8.3.1



[RFC crypto v3 7/9] chtls: Inline crypto request Tx/Rx

2017-12-20 Thread Atul Gupta
TLS handler for record transmit and receive.
Create Inline TLS work request and post to FW.

Signed-off-by: Atul Gupta 
---
v3: made some functions static and initialized few variables
---
 drivers/crypto/chelsio/chtls/chtls_io.c | 1867 +++
 1 file changed, 1867 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_io.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_io.c 
b/drivers/crypto/chelsio/chtls/chtls_io.c
new file mode 100644
index 000..a0f03fb
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_io.c
@@ -0,0 +1,1867 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+static bool is_tls_hw(struct chtls_sock *csk)
+{
+   return csk->tlshws.ofld;
+}
+
+static bool is_tls_rx(struct chtls_sock *csk)
+{
+   return (csk->tlshws.rxkey >= 0);
+}
+
+static bool is_tls_tx(struct chtls_sock *csk)
+{
+   return (csk->tlshws.txkey >= 0);
+}
+
+static bool is_tls_skb(struct chtls_sock *csk, const struct sk_buff *skb)
+{
+   return (is_tls_hw(csk) && skb_ulp_tls_skb_flags(skb));
+}
+
+static int key_size(void *sk)
+{
+   return 16; /* Key on DDR */
+}
+
+#define ceil(x, y) \
+   ({ unsigned long __x = (x), __y = (y); (__x + __y - 1) / __y; })
+
+static int data_sgl_len(const struct sk_buff *skb)
+{
+   unsigned int cnt;
+
+   cnt = skb_shinfo(skb)->nr_frags;
+   return (sgl_len(cnt) * 8);
+}
+
+static int nos_ivs(struct sock *sk, unsigned int size)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+
+   return ceil(size, csk->tlshws.mfs);
+}
+
+#define TLS_WR_CPL_LEN \
+   (sizeof(struct fw_tlstx_data_wr) + \
+   sizeof(struct cpl_tx_tls_sfo))
+
+static int is_ivs_imm(struct sock *sk, const struct sk_buff *skb)
+{
+   int ivs_size = nos_ivs(sk, skb->len) * CIPHER_BLOCK_SIZE;
+   int hlen = TLS_WR_CPL_LEN + data_sgl_len(skb);
+
+   if ((hlen + key_size(sk) + ivs_size) <
+   MAX_IMM_OFLD_TX_DATA_WR_LEN) {
+   ULP_SKB_CB(skb)->ulp.tls.iv = 1;
+   return 1;
+   }
+   ULP_SKB_CB(skb)->ulp.tls.iv = 0;
+   return 0;
+}
+
+static int max_ivs_size(struct sock *sk, int size)
+{
+   return (nos_ivs(sk, size) * CIPHER_BLOCK_SIZE);
+}
+
+static int ivs_size(struct sock *sk, const struct sk_buff *skb)
+{
+   return (is_ivs_imm(sk, skb) ? (nos_ivs(sk, skb->len) *
+CIPHER_BLOCK_SIZE) : 0);
+}
+
+static int flowc_wr_credits(int nparams, int *flowclenp)
+{
+   int flowclen16, flowclen;
+
+   flowclen = offsetof(struct fw_flowc_wr, mnemval[nparams]);
+   flowclen16 = DIV_ROUND_UP(flowclen, 16);
+   flowclen = flowclen16 * 16;
+
+   if (flowclenp)
+   *flowclenp = flowclen;
+
+   return flowclen16;
+}
+
+static struct sk_buff *create_flowc_wr_skb(struct sock *sk,
+  struct fw_flowc_wr *flowc,
+  int flowclen)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct sk_buff *skb;
+
+   skb = alloc_skb(flowclen, GFP_ATOMIC);
+   if (!skb)
+   return NULL;
+
+   memcpy(__skb_put(skb, flowclen), flowc, flowclen);
+   set_queue(skb, (csk->txq_idx << 1) | CPL_PRIORITY_DATA, sk);
+
+   return skb;
+}
+
+static int send_flowc_wr(struct sock *sk, struct fw_flowc_wr *flowc,
+int flowclen)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   bool syn_sent = (sk->sk_state == TCP_SYN_SENT);
+   struct tcp_sock *tp = tcp_sk(sk);
+   int flowclen16 = flowclen / 16;
+   struct sk_buff *skb;
+
+   if (csk_flag(sk, CSK_TX_DATA_SENT)) {
+   skb = create_flowc_wr_skb(sk, flowc, flowclen);
+   if (!skb)
+   return -ENOMEM;
+
+   if (syn_sent)
+   __skb_queue_tail(&csk->ooo_queue, skb);
+   else
+   skb_entail(sk, skb,
+  ULPCB_FLAG_NO_HDR | ULPCB_FLAG_NO_APPEND);
+   return 0;
+   }
+
+   if (!syn_sent) {
+   int ret;
+
+   ret = cxgb4_immdata_send(csk->egress_dev,
+csk->txq_idx,
+flowc, flowclen);
+   if (!ret)
+   return flowclen16;
+   }
+   skb = create_flowc_wr_skb(sk, flowc, flowclen);
+   if (!skb)
+   return -ENOMEM;
+   send_or_defer(sk, tp, skb, 0)

[RFC crypto v3 6/9] chtls: CPL handler definition

2017-12-20 Thread Atul Gupta
CPL handlers for TLS session, record transmit and receive.

Signed-off-by: Atul Gupta 
---
v3: made some functions static and removed un-needed semicolon
---
 drivers/crypto/chelsio/chtls/chtls_cm.c | 2045 +++
 net/ipv4/tcp_minisocks.c|1 +
 2 files changed, 2046 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_cm.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.c 
b/drivers/crypto/chelsio/chtls/chtls_cm.c
new file mode 100644
index 000..dac3c3a
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_cm.c
@@ -0,0 +1,2045 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+extern struct request_sock_ops chtls_rsk_ops;
+
+/*
+ * State transitions and actions for close.  Note that if we are in SYN_SENT
+ * we remain in that state as we cannot control a connection while it's in
+ * SYN_SENT; such connections are allowed to establish and are then aborted.
+ */
+static unsigned char new_state[16] = {
+   /* current state: new state:  action: */
+   /* (Invalid)   */ TCP_CLOSE,
+   /* TCP_ESTABLISHED */ TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+   /* TCP_SYN_SENT*/ TCP_SYN_SENT,
+   /* TCP_SYN_RECV*/ TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+   /* TCP_FIN_WAIT1   */ TCP_FIN_WAIT1,
+   /* TCP_FIN_WAIT2   */ TCP_FIN_WAIT2,
+   /* TCP_TIME_WAIT   */ TCP_CLOSE,
+   /* TCP_CLOSE   */ TCP_CLOSE,
+   /* TCP_CLOSE_WAIT  */ TCP_LAST_ACK | TCP_ACTION_FIN,
+   /* TCP_LAST_ACK*/ TCP_LAST_ACK,
+   /* TCP_LISTEN  */ TCP_CLOSE,
+   /* TCP_CLOSING */ TCP_CLOSING,
+};
+
+static struct chtls_sock *chtls_sock_create(struct chtls_dev *cdev)
+{
+   struct chtls_sock *csk = kzalloc(sizeof(*csk), GFP_ATOMIC);
+
+   if (!csk)
+   return NULL;
+
+   csk->txdata_skb_cache = alloc_skb(TXDATA_SKB_LEN, GFP_ATOMIC);
+   if (!csk->txdata_skb_cache) {
+   kfree(csk);
+   return NULL;
+   }
+
+   kref_init(&csk->kref);
+   csk->cdev = cdev;
+   skb_queue_head_init(&csk->txq);
+   csk->wr_skb_head = NULL;
+   csk->wr_skb_tail = NULL;
+   csk->mss = MAX_MSS;
+   csk->tlshws.ofld = 1;
+   csk->tlshws.txkey = -1;
+   csk->tlshws.rxkey = -1;
+   csk->tlshws.mfs = TLS_MFS;
+   skb_queue_head_init(&csk->tlshws.sk_recv_queue);
+   return csk;
+}
+
+static void chtls_sock_release(struct kref *ref)
+{
+   struct chtls_sock *csk =
+   container_of(ref, struct chtls_sock, kref);
+
+   kfree(csk);
+}
+
+static struct net_device *chtls_ipv4_netdev(struct chtls_dev *cdev,
+   struct sock *sk)
+{
+   struct net_device *ndev = cdev->ports[0];
+
+   if (likely(!inet_sk(sk)->inet_rcv_saddr))
+   return ndev;
+
+   ndev = ip_dev_find(&init_net, inet_sk(sk)->inet_rcv_saddr);
+   if (!ndev)
+   return NULL;
+
+   if (is_vlan_dev(ndev))
+   return vlan_dev_real_dev(ndev);
+   return ndev;
+}
+
+static void assign_rxopt(struct sock *sk, unsigned int opt)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct tcp_sock *tp = tcp_sk(sk);
+   const struct chtls_dev *cdev;
+
+   cdev = csk->cdev;
+   tp->tcp_header_len   = sizeof(struct tcphdr);
+   tp->rx_opt.mss_clamp = cdev->mtus[TCPOPT_MSS_G(opt)] - 40;
+   tp->mss_cache= tp->rx_opt.mss_clamp;
+   tp->rx_opt.tstamp_ok = TCPOPT_TSTAMP_G(opt);
+   tp->rx_opt.snd_wscale= TCPOPT_SACK_G(opt);
+   tp->rx_opt.wscale_ok = TCPOPT_WSCALE_OK_G(opt);
+   SND_WSCALE(tp)   = TCPOPT_SND_WSCALE_G(opt);
+   if (!tp->rx_opt.wscale_ok)
+   tp->rx_opt.rcv_wscale = 0;
+   if (tp->rx_opt.tstamp_ok) {
+   tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED;
+   tp->rx_opt.mss_clamp -= TCPOLEN_TSTAMP_ALIGNED;
+   } else if (csk->opt2 & TSTAMPS_EN_F) {
+   csk->opt2 &= ~TSTAMPS_EN_F;
+   csk->mtu_idx = TCPOPT_MSS_G(opt);
+   }
+}
+
+static void chtls_purge_rcv_queue(struct sock *sk)
+{
+   struct sk_buff *skb;
+
+   while ((skb = __skb_dequeue(&sk->sk_receive_queue)) != NULL) {
+   skb_dst_set(skb, (void *)NULL);
+   kfree_skb(skb);
+   }
+}
+
+static void chtls_purge_write_queue(struct sock *sk)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct 

[RFC crypto v3 3/9] cxgb4: LLD driver changes to enable TLS

2017-12-20 Thread Atul Gupta
Read FW capability. Read key area size. Dump the TLS record count.

Signed-off-by: Atul Gupta 
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c | 18 +++-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c| 32 +--
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h |  7 ++
 drivers/net/ethernet/chelsio/cxgb4/sge.c   | 98 +-
 4 files changed, 142 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
index cf47183..cfc9210 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c
@@ -2826,8 +2826,8 @@ static int meminfo_show(struct seq_file *seq, void *v)
"Tx payload:", "Rx payload:", "LE hash:", "iSCSI region:",
"TDDP region:", "TPT region:", "STAG region:", "RQ region:",
"RQUDP region:", "PBL region:", "TXPBL region:",
-   "DBVFIFO region:", "ULPRX state:", "ULPTX state:",
-   "On-chip queues:"
+   "TLSKey region:", "DBVFIFO region:", "ULPRX state:",
+   "ULPTX state:", "On-chip queues:"
};
 
int i, n;
@@ -2943,6 +2943,12 @@ static int meminfo_show(struct seq_file *seq, void *v)
ulp_region(RX_RQUDP);
ulp_region(RX_PBL);
ulp_region(TX_PBL);
+   if (adap->params.crypto & FW_CAPS_CONFIG_TLS_INLINE) {
+   ulp_region(RX_TLS_KEY);
+   } else {
+   md->base = 0;
+   md->idx = ARRAY_SIZE(region);
+   }
 #undef ulp_region
md->base = 0;
md->idx = ARRAY_SIZE(region);
@@ -3098,6 +3104,14 @@ static int chcr_show(struct seq_file *seq, void *v)
   atomic_read(&adap->chcr_stats.fallback));
seq_printf(seq, "IPSec PDU: %10u\n",
   atomic_read(&adap->chcr_stats.ipsec_cnt));
+
+   seq_puts(seq, "\nChelsio Inline TLS Stats\n");
+   seq_printf(seq, "TLS PDU Tx: %u\n",
+  atomic_read(&adap->chcr_stats.tls_pdu_tx));
+   seq_printf(seq, "TLS PDU Rx: %u\n",
+  atomic_read(&adap->chcr_stats.tls_pdu_rx));
+   seq_printf(seq, "TLS Keys (DDR) Count: %u\n",
+  atomic_read(&adap->chcr_stats.tls_key));
return 0;
 }
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 05a4abf..60eb18b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4086,18 +4086,32 @@ static int adap_init0(struct adapter *adap)
adap->num_ofld_uld += 2;
}
if (caps_cmd.cryptocaps) {
-   /* Should query params here...TODO */
-   params[0] = FW_PARAM_PFVF(NCRYPTO_LOOKASIDE);
-   ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 2,
- params, val);
-   if (ret < 0) {
-   if (ret != -EINVAL)
+   if (ntohs(caps_cmd.cryptocaps) &
+   FW_CAPS_CONFIG_CRYPTO_LOOKASIDE) {
+   params[0] = FW_PARAM_PFVF(NCRYPTO_LOOKASIDE);
+   ret = t4_query_params(adap, adap->mbox, adap->pf, 0,
+ 2, params, val);
+   if (ret < 0) {
+   if (ret != -EINVAL)
+   goto bye;
+   } else {
+   adap->vres.ncrypto_fc = val[0];
+   }
+   adap->num_ofld_uld += 1;
+   }
+   if (ntohs(caps_cmd.cryptocaps) &
+   FW_CAPS_CONFIG_TLS_INLINE) {
+   params[0] = FW_PARAM_PFVF(TLS_START);
+   params[1] = FW_PARAM_PFVF(TLS_END);
+   ret = t4_query_params(adap, adap->mbox, adap->pf, 0,
+ 2, params, val);
+   if (ret < 0)
goto bye;
-   } else {
-   adap->vres.ncrypto_fc = val[0];
+   adap->vres.key.start = val[0];
+   adap->vres.key.size = val[1] - val[0] + 1;
+   adap->num_uld += 1;
}
adap->params.crypto = ntohs(caps_cmd.cryptocaps);
-   adap->num_uld += 1;
}
 #undef FW_PARAM_PFVF
 #undef FW_PARAM_DEV
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index 1d37672..55863f6 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -237,6 +237,7 @@ enum cxgb4_uld {
CXGB4_ULD_ISCSI,
CXGB4_ULD_ISCSIT,
CXGB4_ULD_CRYPTO,
+   CXGB4_ULD_TLS,
CXGB4_ULD_MAX
 };
 
@@ -287,6 +288,7 @@ struct cxgb4_virt_

[RFC crypto v3 4/9] chcr: Key Macro

2017-12-20 Thread Atul Gupta
Define macro for TLS Key context

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chcr_algo.h | 42 +
 drivers/crypto/chelsio/chcr_core.h | 55 +-
 2 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.h 
b/drivers/crypto/chelsio/chcr_algo.h
index d1673a5..f263cd4 100644
--- a/drivers/crypto/chelsio/chcr_algo.h
+++ b/drivers/crypto/chelsio/chcr_algo.h
@@ -86,6 +86,39 @@
 KEY_CONTEXT_OPAD_PRESENT_M)
 #define KEY_CONTEXT_OPAD_PRESENT_F  KEY_CONTEXT_OPAD_PRESENT_V(1U)
 
+#define TLS_KEYCTX_RXFLIT_CNT_S 24
+#define TLS_KEYCTX_RXFLIT_CNT_V(x) ((x) << TLS_KEYCTX_RXFLIT_CNT_S)
+
+#define TLS_KEYCTX_RXPROT_VER_S 20
+#define TLS_KEYCTX_RXPROT_VER_M 0xf
+#define TLS_KEYCTX_RXPROT_VER_V(x) ((x) << TLS_KEYCTX_RXPROT_VER_S)
+
+#define TLS_KEYCTX_RXCIPH_MODE_S 16
+#define TLS_KEYCTX_RXCIPH_MODE_M 0xf
+#define TLS_KEYCTX_RXCIPH_MODE_V(x) ((x) << TLS_KEYCTX_RXCIPH_MODE_S)
+
+#define TLS_KEYCTX_RXAUTH_MODE_S 12
+#define TLS_KEYCTX_RXAUTH_MODE_M 0xf
+#define TLS_KEYCTX_RXAUTH_MODE_V(x) ((x) << TLS_KEYCTX_RXAUTH_MODE_S)
+
+#define TLS_KEYCTX_RXCIAU_CTRL_S 11
+#define TLS_KEYCTX_RXCIAU_CTRL_V(x) ((x) << TLS_KEYCTX_RXCIAU_CTRL_S)
+
+#define TLS_KEYCTX_RX_SEQCTR_S 9
+#define TLS_KEYCTX_RX_SEQCTR_M 0x3
+#define TLS_KEYCTX_RX_SEQCTR_V(x) ((x) << TLS_KEYCTX_RX_SEQCTR_S)
+
+#define TLS_KEYCTX_RX_VALID_S 8
+#define TLS_KEYCTX_RX_VALID_V(x) ((x) << TLS_KEYCTX_RX_VALID_S)
+
+#define TLS_KEYCTX_RXCK_SIZE_S 3
+#define TLS_KEYCTX_RXCK_SIZE_M 0x7
+#define TLS_KEYCTX_RXCK_SIZE_V(x) ((x) << TLS_KEYCTX_RXCK_SIZE_S)
+
+#define TLS_KEYCTX_RXMK_SIZE_S 0
+#define TLS_KEYCTX_RXMK_SIZE_M 0x7
+#define TLS_KEYCTX_RXMK_SIZE_V(x) ((x) << TLS_KEYCTX_RXMK_SIZE_S)
+
 #define CHCR_HASH_MAX_DIGEST_SIZE 64
 #define CHCR_MAX_SHA_DIGEST_SIZE 64
 
@@ -176,6 +209,15 @@
  KEY_CONTEXT_SALT_PRESENT_V(1) | \
  KEY_CONTEXT_CTX_LEN_V((ctx_len)))
 
+#define  FILL_KEY_CRX_HDR(ck_size, mk_size, d_ck, opad, ctx_len) \
+   htonl(TLS_KEYCTX_RXMK_SIZE_V(mk_size) | \
+ TLS_KEYCTX_RXCK_SIZE_V(ck_size) | \
+ TLS_KEYCTX_RX_VALID_V(1) | \
+ TLS_KEYCTX_RX_SEQCTR_V(3) | \
+ TLS_KEYCTX_RXAUTH_MODE_V(4) | \
+ TLS_KEYCTX_RXCIPH_MODE_V(2) | \
+ TLS_KEYCTX_RXFLIT_CNT_V((ctx_len)))
+
 #define FILL_WR_OP_CCTX_SIZE \
htonl( \
FW_CRYPTO_LOOKASIDE_WR_OPCODE_V( \
diff --git a/drivers/crypto/chelsio/chcr_core.h 
b/drivers/crypto/chelsio/chcr_core.h
index 3c29ee0..77056a9 100644
--- a/drivers/crypto/chelsio/chcr_core.h
+++ b/drivers/crypto/chelsio/chcr_core.h
@@ -65,10 +65,58 @@
 struct _key_ctx {
__be32 ctx_hdr;
u8 salt[MAX_SALT];
-   __be64 reserverd;
+   __be64 iv_to_auth;
unsigned char key[0];
 };
 
+#define KEYCTX_TX_WR_IV_S  55
+#define KEYCTX_TX_WR_IV_M  0x1ffULL
+#define KEYCTX_TX_WR_IV_V(x) ((x) << KEYCTX_TX_WR_IV_S)
+#define KEYCTX_TX_WR_IV_G(x) \
+   (((x) >> KEYCTX_TX_WR_IV_S) & KEYCTX_TX_WR_IV_M)
+
+#define KEYCTX_TX_WR_AAD_S 47
+#define KEYCTX_TX_WR_AAD_M 0xffULL
+#define KEYCTX_TX_WR_AAD_V(x) ((x) << KEYCTX_TX_WR_AAD_S)
+#define KEYCTX_TX_WR_AAD_G(x) (((x) >> KEYCTX_TX_WR_AAD_S) & \
+   KEYCTX_TX_WR_AAD_M)
+
+#define KEYCTX_TX_WR_AADST_S 39
+#define KEYCTX_TX_WR_AADST_M 0xffULL
+#define KEYCTX_TX_WR_AADST_V(x) ((x) << KEYCTX_TX_WR_AADST_S)
+#define KEYCTX_TX_WR_AADST_G(x) \
+   (((x) >> KEYCTX_TX_WR_AADST_S) & KEYCTX_TX_WR_AADST_M)
+
+#define KEYCTX_TX_WR_CIPHER_S 30
+#define KEYCTX_TX_WR_CIPHER_M 0x1ffULL
+#define KEYCTX_TX_WR_CIPHER_V(x) ((x) << KEYCTX_TX_WR_CIPHER_S)
+#define KEYCTX_TX_WR_CIPHER_G(x) \
+   (((x) >> KEYCTX_TX_WR_CIPHER_S) & KEYCTX_TX_WR_CIPHER_M)
+
+#define KEYCTX_TX_WR_CIPHERST_S 23
+#define KEYCTX_TX_WR_CIPHERST_M 0x7f
+#define KEYCTX_TX_WR_CIPHERST_V(x) ((x) << KEYCTX_TX_WR_CIPHERST_S)
+#define KEYCTX_TX_WR_CIPHERST_G(x) \
+   (((x) >> KEYCTX_TX_WR_CIPHERST_S) & KEYCTX_TX_WR_CIPHERST_M)
+
+#define KEYCTX_TX_WR_AUTH_S 14
+#define KEYCTX_TX_WR_AUTH_M 0x1ff
+#define KEYCTX_TX_WR_AUTH_V(x) ((x) << KEYCTX_TX_WR_AUTH_S)
+#define KEYCTX_TX_WR_AUTH_G(x) \
+   (((x) >> KEYCTX_TX_WR_AUTH_S) & KEYCTX_TX_WR_AUTH_M)
+
+#define KEYCTX_TX_WR_AUTHST_S 7
+#define KEYCTX_TX_WR_AUTHST_M 0x7f
+#define KEYCTX_TX_WR_AUTHST_V(x) ((x) << KEYCTX_TX_WR_AUTHST_S)
+#define KEYCTX_TX_WR_AUTHST_G(x) \
+   (((x) >> KEYCTX_TX_WR_AUTHST_S) & KEYCTX_TX_WR_AUTHST_M)
+
+#define KEYCTX_TX_WR_AUTHIN_S 0
+#define KEYCTX_TX_WR_AUTHIN_M 0x7f
+#define KEYCTX_TX_WR_AUTHIN_V(x) ((x) << KEYCTX_TX_WR_AUTHIN_S)
+#define KEYCTX_TX_WR_AUTHIN_G(x) \
+   (((x) >> KEYCTX_TX_WR_AUTHIN_S) & KEYCTX_TX_WR_AUTHIN_M)
+
 struct chcr_wr {
struct fw_crypto_lookaside_wr wreq;
struct ulp_txpkt ulptx;
@@ -90,6 +138,11 @@ struct uld_ctx {
struct chcr_dev *dev;
 };
 
+struct sge_opaq

[RFC crypto v3 5/9] chtls: Key program

2017-12-20 Thread Atul Gupta
Program the tx and rx key on chip.

Signed-off-by: Atul Gupta 
---
v3: made some functions static
---
 drivers/crypto/chelsio/chtls/chtls_hw.c | 394 
 1 file changed, 394 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_hw.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_hw.c 
b/drivers/crypto/chelsio/chtls/chtls_hw.c
new file mode 100644
index 000..c3e17159
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_hw.c
@@ -0,0 +1,394 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+static void __set_tcb_field_direct(struct chtls_sock *csk,
+  struct cpl_set_tcb_field *req, u16 word,
+  u64 mask, u64 val, u8 cookie, int no_reply)
+{
+   struct ulptx_idata *sc;
+
+   INIT_TP_WR_CPL(req, CPL_SET_TCB_FIELD, csk->tid);
+   req->wr.wr_mid |= htonl(FW_WR_FLOWID_V(csk->tid));
+   req->reply_ctrl = htons(NO_REPLY_V(no_reply) |
+   QUEUENO_V(csk->rss_qid));
+   req->word_cookie = htons(TCB_WORD_V(word) | TCB_COOKIE_V(cookie));
+   req->mask = cpu_to_be64(mask);
+   req->val = cpu_to_be64(val);
+   sc = (struct ulptx_idata *)(req + 1);
+   sc->cmd_more = htonl(ULPTX_CMD_V(ULP_TX_SC_NOOP));
+   sc->len = htonl(0);
+}
+
+static void __set_tcb_field(struct sock *sk, struct sk_buff *skb, u16 word,
+   u64 mask, u64 val, u8 cookie, int no_reply)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct cpl_set_tcb_field *req;
+   struct ulptx_idata *sc;
+   unsigned int wrlen = roundup(sizeof(*req) + sizeof(*sc), 16);
+
+   req = (struct cpl_set_tcb_field *)__skb_put(skb, wrlen);
+   __set_tcb_field_direct(csk, req, word, mask, val, cookie, no_reply);
+   set_wr_txq(skb, CPL_PRIORITY_CONTROL, csk->port_id);
+}
+
+static int chtls_set_tcb_field(struct sock *sk, u16 word, u64 mask, u64 val)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct sk_buff *skb;
+   struct cpl_set_tcb_field *req;
+   struct ulptx_idata *sc;
+   unsigned int wrlen = roundup(sizeof(*req) + sizeof(*sc), 16);
+   unsigned int credits_needed = DIV_ROUND_UP(wrlen, 16);
+
+   skb = alloc_skb(wrlen, GFP_ATOMIC);
+   if (!skb)
+   return -ENOMEM;
+
+   __set_tcb_field(sk, skb, word, mask, val, 0, 1);
+   set_queue(skb, (csk->txq_idx << 1) | CPL_PRIORITY_DATA, sk);
+   csk->wr_credits -= credits_needed;
+   csk->wr_unacked += credits_needed;
+   enqueue_wr(csk, skb);
+   cxgb4_ofld_send(csk->egress_dev, skb);
+   return 0;
+}
+
+/*
+ * Set one of the t_flags bits in the TCB.
+ */
+int chtls_set_tcb_tflag(struct sock *sk, unsigned int bit_pos, int val)
+{
+   return chtls_set_tcb_field(sk, 1, 1ULL << bit_pos,
+   val << bit_pos);
+}
+
+static int chtls_set_tcb_keyid(struct sock *sk, int keyid)
+{
+   return chtls_set_tcb_field(sk, 31, 0xULL, keyid);
+}
+
+static int chtls_set_tcb_seqno(struct sock *sk)
+{
+   return chtls_set_tcb_field(sk, 28, ~0ULL, 0);
+}
+
+static int chtls_set_tcb_quiesce(struct sock *sk, int val)
+{
+   return chtls_set_tcb_field(sk, 1, (1ULL << TF_RX_QUIESCE_S),
+  TF_RX_QUIESCE_V(val));
+}
+
+static void *chtls_alloc_mem(unsigned long size)
+{
+   void *p = kmalloc(size, GFP_KERNEL);
+
+   if (!p)
+   p = vmalloc(size);
+   if (p)
+   memset(p, 0, size);
+   return p;
+}
+
+static void chtls_free_mem(void *addr)
+{
+   unsigned long p = (unsigned long)addr;
+
+   if (p >= VMALLOC_START && p < VMALLOC_END)
+   vfree(addr);
+   else
+   kfree(addr);
+}
+
+/* TLS Key bitmap processing */
+int chtls_init_kmap(struct chtls_dev *cdev, struct cxgb4_lld_info *lldi)
+{
+   unsigned int num_key_ctx, bsize;
+
+   num_key_ctx = (lldi->vr->key.size / TLS_KEY_CONTEXT_SZ);
+   bsize = BITS_TO_LONGS(num_key_ctx);
+
+   cdev->kmap.size = num_key_ctx;
+   cdev->kmap.available = bsize;
+   cdev->kmap.addr = chtls_alloc_mem(sizeof(*cdev->kmap.addr) *
+ bsize);
+   if (!cdev->kmap.addr)
+   return -1;
+
+   cdev->kmap.start = lldi->vr->key.start;
+   spin_lock_init(&cdev->kmap.lock);
+   return 0;
+}
+
+void chtls_free_kmap(struct chtls_dev *cdev)
+{
+   if (cdev->kmap.addr)
+   chtls_free_mem(cdev->kmap.addr);
+}
+
+st

[RFC crypto v3 2/9] cxgb4: Inline TLS FW Interface

2017-12-20 Thread Atul Gupta
Key area size in hw-config file. CPL struct for TLS request
and response. Work request for Inline TLS.

Signed-off-by: Atul Gupta 
---
 drivers/net/ethernet/chelsio/cxgb4/t4_msg.h   | 121 ++-
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h  |   2 +
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h | 165 +-
 3 files changed, 283 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h 
b/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h
index 7e12f24..9a56e0d 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h
@@ -81,6 +81,7 @@ enum {
CPL_RX_ISCSI_CMP  = 0x45,
CPL_TRACE_PKT_T5  = 0x48,
CPL_RX_ISCSI_DDP  = 0x49,
+   CPL_RX_TLS_CMP= 0x4E,
 
CPL_RDMA_READ_REQ = 0x60,
 
@@ -88,6 +89,7 @@ enum {
CPL_ACT_OPEN_REQ6 = 0x83,
 
CPL_TX_TLS_PDU =0x88,
+   CPL_TX_TLS_SFO= 0x89,
CPL_TX_SEC_PDU= 0x8A,
CPL_TX_TLS_ACK= 0x8B,
 
@@ -97,6 +99,7 @@ enum {
CPL_RX_MPS_PKT= 0xAF,
 
CPL_TRACE_PKT = 0xB0,
+   CPL_TLS_DATA  = 0xB1,
CPL_ISCSI_DATA= 0xB2,
 
CPL_FW4_MSG   = 0xC0,
@@ -150,6 +153,7 @@ enum {
ULP_MODE_RDMA  = 4,
ULP_MODE_TCPDDP= 5,
ULP_MODE_FCOE  = 6,
+   ULP_MODE_TLS   = 8,
 };
 
 enum {
@@ -1414,6 +1418,14 @@ struct cpl_tx_data {
 #define TX_FORCE_S 13
 #define TX_FORCE_V(x)  ((x) << TX_FORCE_S)
 
+#define TX_SHOVE_S14
+#define TX_SHOVE_V(x) ((x) << TX_SHOVE_S)
+
+#define TX_ULP_MODE_S10
+#define TX_ULP_MODE_M0x7
+#define TX_ULP_MODE_V(x) ((x) << TX_ULP_MODE_S)
+#define TX_ULP_MODE_G(x) (((x) >> TX_ULP_MODE_S) & TX_ULP_MODE_M)
+
 #define T6_TX_FORCE_S  20
 #define T6_TX_FORCE_V(x)   ((x) << T6_TX_FORCE_S)
 #define T6_TX_FORCE_F  T6_TX_FORCE_V(1U)
@@ -1428,12 +1440,21 @@ enum {
ULP_TX_SC_NOOP = 0x80,
ULP_TX_SC_IMM  = 0x81,
ULP_TX_SC_DSGL = 0x82,
-   ULP_TX_SC_ISGL = 0x83
+   ULP_TX_SC_ISGL = 0x83,
+   ULP_TX_SC_MEMRD = 0x86
 };
 
 #define ULPTX_CMD_S24
 #define ULPTX_CMD_V(x) ((x) << ULPTX_CMD_S)
 
+#define ULPTX_LEN16_S0
+#define ULPTX_LEN16_M0xFF
+#define ULPTX_LEN16_V(x) ((x) << ULPTX_LEN16_S)
+
+#define ULP_TX_SC_MORE_S 23
+#define ULP_TX_SC_MORE_V(x) ((x) << ULP_TX_SC_MORE_S)
+#define ULP_TX_SC_MORE_F  ULP_TX_SC_MORE_V(1U)
+
 struct ulptx_sge_pair {
__be32 len[2];
__be64 addr[2];
@@ -1948,4 +1969,102 @@ enum {
X_CPL_RX_MPS_PKT_TYPE_QFC   = 1 << 2,
X_CPL_RX_MPS_PKT_TYPE_PTP   = 1 << 3
 };
+
+struct cpl_tx_tls_sfo {
+   __be32 op_to_seg_len;
+   __be32 pld_len;
+   __be32 type_protover;
+   __be32 r1_lo;
+   __be32 seqno_numivs;
+   __be32 ivgen_hdrlen;
+   __be64 scmd1;
+};
+
+/* cpl_tx_tls_sfo macros */
+#define CPL_TX_TLS_SFO_OPCODE_S 24
+#define CPL_TX_TLS_SFO_OPCODE_V(x)  ((x) << CPL_TX_TLS_SFO_OPCODE_S)
+
+#define CPL_TX_TLS_SFO_DATA_TYPE_S  20
+#define CPL_TX_TLS_SFO_DATA_TYPE_V(x)   ((x) << CPL_TX_TLS_SFO_DATA_TYPE_S)
+
+#define CPL_TX_TLS_SFO_CPL_LEN_S16
+#define CPL_TX_TLS_SFO_CPL_LEN_V(x) ((x) << CPL_TX_TLS_SFO_CPL_LEN_S)
+
+#define CPL_TX_TLS_SFO_SEG_LEN_S0
+#define CPL_TX_TLS_SFO_SEG_LEN_M0x
+#define CPL_TX_TLS_SFO_SEG_LEN_V(x) ((x) << CPL_TX_TLS_SFO_SEG_LEN_S)
+#define CPL_TX_TLS_SFO_SEG_LEN_G(x) \
+   (((x) >> CPL_TX_TLS_SFO_SEG_LEN_S) & CPL_TX_TLS_SFO_SEG_LEN_M)
+
+#define CPL_TX_TLS_SFO_TYPE_S   24
+#define CPL_TX_TLS_SFO_TYPE_M   0xff
+#define CPL_TX_TLS_SFO_TYPE_V(x)((x) << CPL_TX_TLS_SFO_TYPE_S)
+#define CPL_TX_TLS_SFO_TYPE_G(x)\
+   (((x) >> CPL_TX_TLS_SFO_TYPE_S) & CPL_TX_TLS_SFO_TYPE_M)
+
+#define CPL_TX_TLS_SFO_PROTOVER_S   8
+#define CPL_TX_TLS_SFO_PROTOVER_M   0x
+#define CPL_TX_TLS_SFO_PROTOVER_V(x)((x) << CPL_TX_TLS_SFO_PROTOVER_S)
+#define CPL_TX_TLS_SFO_PROTOVER_G(x)\
+   (((x) >> CPL_TX_TLS_SFO_PROTOVER_S) & CPL_TX_TLS_SFO_PROTOVER_M)
+
+struct cpl_tls_data {
+   struct rss_header rsshdr;
+   union opcode_tid ot;
+   __be32 length_pkd;
+   __be32 seq;
+   __be32 r1;
+};
+
+#define CPL_TLS_DATA_OPCODE_S   24
+#define CPL_TLS_DATA_OPCODE_M   0xff
+#define CPL_TLS_DATA_OPCODE_V(x)((x) << CPL_TLS_DATA_OPCODE_S)
+#define CPL_TLS_DATA_OPCODE_G(x)\
+   (((x) >> CPL_TLS_DATA_OPCODE_S) & CPL_TLS_DATA_OPCODE_M)
+
+#define CPL_TLS_DATA_TID_S  0
+#define CPL_TLS_DATA_TID_M  0xff
+#define CPL_TLS_DATA_TID_V(x)   ((x) << CPL_TLS_DATA_TID_S)
+#define CPL_TLS_DATA_TID_G(x)   \
+   (((x) >> CPL_TLS_DATA_TID_S) & CPL_TLS_DATA_TID_M)
+
+#define CPL_TLS_DATA_LENGTH_S   0
+#define CPL_TLS_DATA_LENGTH_M   0x
+#define CPL_TLS_DATA_LENGTH_V(x)((x) 

[RFC crypto v3 1/9] chtls: structure and macro definiton

2017-12-20 Thread Atul Gupta
Inline TLS state, connection management. Supporting macros definition.

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chtls/chtls.h| 480 
 drivers/crypto/chelsio/chtls/chtls_cm.h | 203 ++
 2 files changed, 683 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls.h
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_cm.h

diff --git a/drivers/crypto/chelsio/chtls/chtls.h 
b/drivers/crypto/chelsio/chtls/chtls.h
new file mode 100644
index 000..33d3ed4
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls.h
@@ -0,0 +1,480 @@
+/*
+ * Copyright (c) 2016 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __CHTLS_H__
+#define __CHTLS_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "t4fw_api.h"
+#include "t4_msg.h"
+#include "cxgb4.h"
+#include "cxgb4_uld.h"
+#include "l2t.h"
+#include "chcr_algo.h"
+#include "chcr_core.h"
+#include "chcr_crypto.h"
+
+#define CIPHER_BLOCK_SIZE   16
+#define MAX_IVS_PAGE256
+#define TLS_KEY_CONTEXT_SZ 64
+#define TLS_HEADER_LENGTH  5
+#define SCMD_CIPH_MODE_AES_GCM  2
+#define GCM_TAG_SIZE16
+#define AEAD_EXPLICIT_DATA_SIZE 8
+/* Any MFS size should work and come from openssl */
+#define TLS_MFS16384
+
+#define SOCK_INLINE (31)
+#define RSS_HDR sizeof(struct rss_header)
+
+enum {
+   CHTLS_KEY_CONTEXT_DSGL,
+   CHTLS_KEY_CONTEXT_IMM,
+   CHTLS_KEY_CONTEXT_DDR,
+};
+
+enum {
+   CHTLS_LISTEN_START,
+   CHTLS_LISTEN_STOP,
+};
+
+/* Flags for return value of CPL message handlers */
+enum {
+   CPL_RET_BUF_DONE = 1,   /* buffer processing done */
+   CPL_RET_BAD_MSG = 2,/* bad CPL message */
+   CPL_RET_UNKNOWN_TID = 4 /* unexpected unknown TID */
+};
+
+#define TLS_RCV_ST_READ_HEADER  0xF0
+#define TLS_RCV_ST_READ_BODY0xF1
+#define TLS_RCV_ST_READ_DONE0xF2
+#define TLS_RCV_ST_READ_NB  0xF3
+
+#define RSPQ_HASH_BITS 5
+#define LISTEN_INFO_HASH_SIZE 32
+struct listen_info {
+   struct listen_info *next;  /* Link to next entry */
+   struct sock *sk;   /* The listening socket */
+   unsigned int stid; /* The server TID */
+};
+
+enum {
+   T4_LISTEN_START_PENDING,
+   T4_LISTEN_STARTED
+};
+
+enum csk_flags {
+   CSK_CALLBACKS_CHKD, /* socket callbacks have been sanitized */
+   CSK_ABORT_REQ_RCVD, /* received one ABORT_REQ_RSS message */
+   CSK_TX_MORE_DATA,   /* sending ULP data; don't set SHOVE bit */
+   CSK_TX_WAIT_IDLE,   /* suspend Tx until in-flight data is ACKed */
+   CSK_ABORT_SHUTDOWN, /* shouldn't send more abort requests */
+   CSK_ABORT_RPL_PENDING,  /* expecting an abort reply */
+   CSK_CLOSE_CON_REQUESTED,/* we've sent a close_conn_req */
+   CSK_TX_DATA_SENT,   /* sent a TX_DATA WR on this connection */
+   CSK_TX_FAILOVER,/* Tx traffic failing over */
+   CSK_UPDATE_RCV_WND, /* Need to update rcv window */
+   CSK_RST_ABORTED,/* outgoing RST was aborted */
+   CSK_TLS_HANDSHK,/* TLS Handshake */
+};
+
+struct listen_ctx {
+   struct sock *lsk;
+   struct chtls_dev *cdev;
+   u32 state;
+};
+
+struct key_map {
+   unsigned long *addr;
+   unsigned int start;
+   unsigned int available;
+   unsigned int size;
+   spinlock_t lock; /* lock for key id request from map */
+} __packed;
+
+struct tls_scmd {
+   __be32 seqno_numivs;
+   __be32 ivgen_hdrlen;
+};
+
+struct chtls_dev {
+   struct list_head list;
+   struct cxgb4_lld_info *lldi;
+   struct pci_dev *pdev;
+   struct listen_info *listen_hash_tab[LISTEN_INFO_HASH_SIZE];
+   spinlock_t listen_lock; /* lock for listen list */
+   struct net_device **ports;
+   struct tid_info *tids;
+   unsigned int pfvf;
+   const unsigned short *mtus;
+
+   spinlock_t aidr_lock cacheline_aligned_in_smp;
+   struct idr aidr; /* ATID id space */
+   struct idr hwtid_idr;
+   struct idr stid_idr;
+
+   spinlock_t idr_lock cacheline_aligned_in_smp;
+
+   struct net_device *egr_dev[NCHAN * 2];
+   struct sk_buff *rspq_skb_cache[1 << RSPQ_HASH_BITS];
+   struct sk_buff *askb;
+
+   struct sk_buff_head deferq;
+   struct work_struct deferq_task;
+
+   struct list_head list_node;
+   struct list_head rcu_node;
+   struct list_head na_node;
+   unsigned int send_page_order;
+   struct key_map kmap;
+};
+
+struct chtls_hws {
+   struct sk_buff_head sk_recv_queue;
+   u8 txqid

[RFC crypto v3 0/9] Chelsio Inline TLS

2017-12-20 Thread Atul Gupta
RFC series for Chelsio Inline TLS driver (chtls.ko)

Driver use the ULP infrastructure to register chtls as Inline TLS ULP.
Chtls use TCP Sockets to transmit and receive TLS record. TCP proto_ops
is extended to offload TLS record.

T6 adapter provides the following features:
-TLS record offload, TLS header, encrypt, digest and transmit
-TLS record receive and decrypt
-TLS keys store
-TCP/IP engine
-TLS engine
-GCM crypto engine [support CBC also]

TLS provides security at the transport layer. It uses TCP to provide
reliable end-to-end transport of application data. It relies on TCP
for any retransmission. TLS session comprises of three parts:
a. TCP/IP connection
b. TLS handshake
c. Record layer processing

TLS handshake state machine is executed in host (refer standard
implementation eg. OpenSSL).  Setsockopt [SOL_TCP, TCP_ULP] initialize
TCP proto-ops for Chelsio inline tls support. setsockopt(sock, SOL_TCP,
TCP_ULP, "chtls", sizeof("chtls"));

Tx and Rx Keys are decided during handshake and programmed onto the chip
after CCS is exchanged.
struct tls12_crypto_info_aes_gcm_128 crypto_info
setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info))
Finish is the first encrypted/decrypted message tx/rx inline.

On the Tx path TLS engine receive plain text from openssl, insert IV,
fetches the tx key, create cipher text records and generate MAC. TLS
header is added to cipher text and forward to TCP/IP engine for transport
layer processing and transmission on wire.
TX:
Application--openssl--chtls---TLS engine---encrypt/auth---TCP/IP
engine---wire.

On the Rx side, data received is PDU aligned at record
boundaries. TLS processes only the complete record. If rx key is programmed
on CCS receive, data is decrypted and plain text is posted to host.
RX:
Wire--cipher-text--TCP/IP engine [PDU align]---TLS engine---
decrypt/auth---plain-text--chtls--openssl--application

v3: fixed the kbuild test issues
   -made few funtions static
   -initialized few variables

v2: fixed the following based on the review comments of Stephan Mueller,
Stefano Brivio and Hannes Frederic
-Added more details in cover letter
-Fixed indentation and formating issues
-Using aes instead of aes-generic
-memset key info after programing the key on chip
-reordered the patch sequence

Atul Gupta (9):
  chtls: structure and macro definiton
  cxgb4: Inline TLS FW Interface
  cxgb4: LLD driver changes to enable TLS
  chcr: Key Macro
  chtls: Key program
  chtls: CPL handler definition
  chtls: Inline crypto request Tx/Rx
  chtls: Register the ULP
  Makefile Kconfig

 drivers/crypto/chelsio/Kconfig |   10 +
 drivers/crypto/chelsio/Makefile|1 +
 drivers/crypto/chelsio/chcr_algo.h |   42 +
 drivers/crypto/chelsio/chcr_core.h |   55 +-
 drivers/crypto/chelsio/chtls/Makefile  |4 +
 drivers/crypto/chelsio/chtls/chtls.h   |  480 +
 drivers/crypto/chelsio/chtls/chtls_cm.c| 2045 
 drivers/crypto/chelsio/chtls/chtls_cm.h|  203 ++
 drivers/crypto/chelsio/chtls/chtls_hw.c|  394 
 drivers/crypto/chelsio/chtls/chtls_io.c| 1867 ++
 drivers/crypto/chelsio/chtls/chtls_main.c  |  584 ++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_debugfs.c |   18 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c|   32 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h |7 +
 drivers/net/ethernet/chelsio/cxgb4/sge.c   |   98 +-
 drivers/net/ethernet/chelsio/cxgb4/t4_msg.h|  121 +-
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h   |2 +
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h  |  165 +-
 include/uapi/linux/tls.h   |1 +
 net/ipv4/tcp_minisocks.c   |1 +
 20 files changed, 6111 insertions(+), 19 deletions(-)
 create mode 100644 drivers/crypto/chelsio/chtls/Makefile
 create mode 100644 drivers/crypto/chelsio/chtls/chtls.h
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_cm.c
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_cm.h
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_hw.c
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_io.c
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_main.c

-- 
1.8.3.1



Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Dmitry Vyukov
On Wed, Dec 20, 2017 at 10:55 AM, Stephan Mueller  wrote:
> Am Mittwoch, 20. Dezember 2017, 10:50:10 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> On Wed, Dec 20, 2017 at 10:29 AM, Stephan Mueller 
> wrote:
>> > Am Mittwoch, 20. Dezember 2017, 10:19:43 CET schrieb Dmitry Vyukov:
>> >
>> > Hi Dmitry,
>> >
>> >> > This issue vanishes after applying the patch "[PATCH v2] crypto: AF_ALG
>> >> > -
>> >> > limit mask and type".
>> >>
>> >> Hi Stephan,
>> >>
>> >> syzbot does not understand arbitrary English prose, it only understands
>> >
>> > this:
>> >> > Once a fix for this bug is merged into any tree, reply to this email
>> >> > with:
>> >> > #syz fix: exact-commit-title
>> >>
>> >> Let's tell it about the fix:
>> >>
>> >> #syz fix: crypto: AF_ALG - limit mask and type
>> >
>> > I have seen that this is the approach, but the fix is not yet in the tree.
>> > I just want to let folks know that there is a patch.
>>
>> Ah, ok, sorry. It's just difficult to tell when there is a reason to
>> not provide the tag right now, or when people are don't know about
>> them or ignore.
>> If the patch is merged with this title, then there is nothing else to
>> do. If it's merged under a different title, a new "#syz fix:" tag will
>> override the old one.
>
> Maybe you can teach the syzcaller that there is a proposed fix? E.g.
>
> #syz proposed: commit-title

What will be its meaning? How will it differ from fix?


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread syzbot

Am Mittwoch, 20. Dezember 2017, 10:50:10 CET schrieb Dmitry Vyukov:



Hi Dmitry,



On Wed, Dec 20, 2017 at 10:29 AM, Stephan Mueller 

wrote:

> Am Mittwoch, 20. Dezember 2017, 10:19:43 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> > This issue vanishes after applying the patch "[PATCH v2] crypto:  
AF_ALG

>> > -
>> > limit mask and type".
>>
>> Hi Stephan,
>>
>> syzbot does not understand arbitrary English prose, it only  
understands

>
> this:
>> > Once a fix for this bug is merged into any tree, reply to this email
>> > with:
>> > #syz fix: exact-commit-title
>>
>> Let's tell it about the fix:
>>
>> #syz fix: crypto: AF_ALG - limit mask and type
>
> I have seen that this is the approach, but the fix is not yet in the  
tree.

> I just want to let folks know that there is a patch.



Ah, ok, sorry. It's just difficult to tell when there is a reason to
not provide the tag right now, or when people are don't know about
them or ignore.
If the patch is merged with this title, then there is nothing else to
do. If it's merged under a different title, a new "#syz fix:" tag will
override the old one.



Maybe you can teach the syzcaller that there is a proposed fix? E.g.



#syz proposed: commit-title


unknown command "proposed:"




Ciao
Stephan


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Stephan Mueller
Am Mittwoch, 20. Dezember 2017, 10:50:10 CET schrieb Dmitry Vyukov:

Hi Dmitry,

> On Wed, Dec 20, 2017 at 10:29 AM, Stephan Mueller  
wrote:
> > Am Mittwoch, 20. Dezember 2017, 10:19:43 CET schrieb Dmitry Vyukov:
> > 
> > Hi Dmitry,
> > 
> >> > This issue vanishes after applying the patch "[PATCH v2] crypto: AF_ALG
> >> > -
> >> > limit mask and type".
> >> 
> >> Hi Stephan,
> >> 
> >> syzbot does not understand arbitrary English prose, it only understands
> > 
> > this:
> >> > Once a fix for this bug is merged into any tree, reply to this email
> >> > with:
> >> > #syz fix: exact-commit-title
> >> 
> >> Let's tell it about the fix:
> >> 
> >> #syz fix: crypto: AF_ALG - limit mask and type
> > 
> > I have seen that this is the approach, but the fix is not yet in the tree.
> > I just want to let folks know that there is a patch.
> 
> Ah, ok, sorry. It's just difficult to tell when there is a reason to
> not provide the tag right now, or when people are don't know about
> them or ignore.
> If the patch is merged with this title, then there is nothing else to
> do. If it's merged under a different title, a new "#syz fix:" tag will
> override the old one.

Maybe you can teach the syzcaller that there is a proposed fix? E.g.

#syz proposed: commit-title


Ciao
Stephan


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread syzbot

Am Mittwoch, 20. Dezember 2017, 10:50:10 CET schrieb Dmitry Vyukov:



Hi Dmitry,



On Wed, Dec 20, 2017 at 10:29 AM, Stephan Mueller 

wrote:

> Am Mittwoch, 20. Dezember 2017, 10:19:43 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>
>> > This issue vanishes after applying the patch "[PATCH v2] crypto:  
AF_ALG

>> > -
>> > limit mask and type".
>>
>> Hi Stephan,
>>
>> syzbot does not understand arbitrary English prose, it only  
understands

>
> this:
>> > Once a fix for this bug is merged into any tree, reply to this email
>> > with:
>> > #syz fix: exact-commit-title
>>
>> Let's tell it about the fix:
>>
>> #syz fix: crypto: AF_ALG - limit mask and type
>
> I have seen that this is the approach, but the fix is not yet in the  
tree.

> I just want to let folks know that there is a patch.



Ah, ok, sorry. It's just difficult to tell when there is a reason to
not provide the tag right now, or when people are don't know about
them or ignore.
If the patch is merged with this title, then there is nothing else to
do. If it's merged under a different title, a new "#syz fix:" tag will
override the old one.



Maybe you can teach the syzcaller that there is a proposed fix? E.g.



#syz proposed: commit-title


unknown command "proposed:"




Ciao
Stephan



--
You received this message because you are subscribed to the Google  
Groups "syzkaller-bugs" group.
To unsubscribe from this group and stop receiving emails from it, send an  
email to syzkaller-bugs+unsubscr...@googlegroups.com.
To view this discussion on the web visit  
https://groups.google.com/d/msgid/syzkaller-bugs/18467907.EfXNf1iGip%40tauon.chronox.de.

For more options, visit https://groups.google.com/d/optout.


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Dmitry Vyukov
On Wed, Dec 20, 2017 at 10:29 AM, Stephan Mueller  wrote:
> Am Mittwoch, 20. Dezember 2017, 10:19:43 CET schrieb Dmitry Vyukov:
>
> Hi Dmitry,
>> >
>> > This issue vanishes after applying the patch "[PATCH v2] crypto: AF_ALG -
>> > limit mask and type".
>>
>> Hi Stephan,
>>
>> syzbot does not understand arbitrary English prose, it only understands
> this:
>> > Once a fix for this bug is merged into any tree, reply to this email with:
>> > #syz fix: exact-commit-title
>>
>> Let's tell it about the fix:
>>
>> #syz fix: crypto: AF_ALG - limit mask and type
>
> I have seen that this is the approach, but the fix is not yet in the tree. I
> just want to let folks know that there is a patch.

Ah, ok, sorry. It's just difficult to tell when there is a reason to
not provide the tag right now, or when people are don't know about
them or ignore.
If the patch is merged with this title, then there is nothing else to
do. If it's merged under a different title, a new "#syz fix:" tag will
override the old one.


Re: [PATCH] checkpatch: add *_ON_STACK to $declaration_macros

2017-12-20 Thread Joe Perches
On Tue, 2017-06-27 at 10:55 +0300, Gilad Ben-Yossef wrote:
> Add the crypto API *_ON_STACK to $declaration_macros.
> 
> Resolves the following false warning:
> 
> WARNING: Missing a blank line after declarations
> + int err;
> + SHASH_DESC_ON_STACK(desc, ctx_p->shash_tfm);
[]
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -733,6 +733,7 @@ our $FuncArg = 
> qr{$Typecast{0,1}($LvalOrFunc|$Constant|$String)};
>  our $declaration_macros = qr{(?x:
>   
> (?:$Storage\s+)?(?:[A-Z_][A-Z0-9]*_){0,2}(?:DEFINE|DECLARE)(?:_[A-Z0-9]+){1,6}\s*\(|
>   (?:$Storage\s+)?LIST_HEAD\s*\(|
> + (?:$Storage\s+)?[A-Z_]*_ON_STACK\(|

A few things:

The crypto _ON_STACK declarations cannot have a $Storage type.

There should be a \s* between the ON_STACK and the open parenthesis

There are other _ON_STACK types than the crypto uses

$ grep -rP --include=*.[ch] -oh "\b[A-Z_]+_ON_STACK\b" * | \
  sort | uniq -c | sort -rn
 68 SKCIPHER_REQUEST_ON_STACK
 45 SHASH_DESC_ON_STACK
 10 AHASH_REQUEST_ON_STACK
  4 PC_LOC_ON_STACK
  4 DECLARE_DPM_WATCHDOG_ON_STACK
  3 HISI_SAS_DECLARE_RST_WORK_ON_STACK
  3 CONFIG_ARCH_TASK_STRUCT_ON_STACK
  1 PT_SIZE_ON_STACK
  1 ALLOC_PT_GPREGS_ON_STACK

So perhaps:

(?:SKCIPHER_REQUEST|SHASH_DESC|AHASH_REQUEST)_ON_STACK\s*\(



Re: [PATCH RESEND] checkpatch: add *_ON_STACK to declaration_macros

2017-12-20 Thread Greg Kroah-Hartman
On Wed, Dec 20, 2017 at 08:31:19AM +, Gilad Ben-Yossef wrote:
> Add the crypto API *_ON_STACK to $declaration_macros.
> 
> Resolves the following false warning:
> 
> WARNING: Missing a blank line after declarations
> + int err;
> + SHASH_DESC_ON_STACK(desc, ctx_p->shash_tfm);
> 
> Signed-off-by: Gilad Ben-Yossef 

Acked-by: Greg Kroah-Hartman 


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Stephan Mueller
Am Mittwoch, 20. Dezember 2017, 10:19:43 CET schrieb Dmitry Vyukov:

Hi Dmitry,
> > 
> > This issue vanishes after applying the patch "[PATCH v2] crypto: AF_ALG -
> > limit mask and type".
> 
> Hi Stephan,
> 
> syzbot does not understand arbitrary English prose, it only understands 
this:
> > Once a fix for this bug is merged into any tree, reply to this email with:
> > #syz fix: exact-commit-title
> 
> Let's tell it about the fix:
> 
> #syz fix: crypto: AF_ALG - limit mask and type

I have seen that this is the approach, but the fix is not yet in the tree. I 
just want to let folks know that there is a patch.


Ciao
Stephan


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Dmitry Vyukov
On Wed, Dec 20, 2017 at 10:17 AM, Stephan Müller  wrote:
> Am Mittwoch, 20. Dezember 2017, 08:48:01 CET schrieb syzbot:
>
> Hi,
>
>> Hello,
>>
>> syzkaller hit the following crash on
>> 032b4cc8ff84490c4bc7c4ef8c91e6d83a637538
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> C reproducer is attached
>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> for information about syzkaller reproducers
>>
>>
>> ==
>> BUG: KASAN: use-after-free in crypto_aead_free_instance+0xc0/0xd0
>> crypto/aead.c:154
>> Read of size 8 at addr 8801c32cf240 by task cryptomgr_test/6646
>>
>> CPU: 1 PID: 6646 Comm: cryptomgr_test Not tainted 4.15.0-rc3+ #132
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>>   __dump_stack lib/dump_stack.c:17 [inline]
>>   dump_stack+0x194/0x257 lib/dump_stack.c:53
>>   print_address_description+0x73/0x250 mm/kasan/report.c:252
>>   kasan_report_error mm/kasan/report.c:351 [inline]
>>   kasan_report+0x25b/0x340 mm/kasan/report.c:409
>>   __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
>>   crypto_aead_free_instance+0xc0/0xd0 crypto/aead.c:154
>>   crypto_free_instance+0x6d/0x100 crypto/algapi.c:77
>>   crypto_destroy_instance+0x3c/0x80 crypto/algapi.c:85
>>   crypto_alg_put crypto/internal.h:116 [inline]
>>   crypto_remove_final+0x212/0x370 crypto/algapi.c:331
>>   crypto_alg_tested+0x445/0x6f0 crypto/algapi.c:320
>>   cryptomgr_test+0x17/0x30 crypto/algboss.c:226
>>   kthread+0x37a/0x440 kernel/kthread.c:238
>>   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:441
>>
>> Allocated by task 6641:
>>   save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>>   set_track mm/kasan/kasan.c:459 [inline]
>>   kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>>   kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3610
>>   kmalloc include/linux/slab.h:499 [inline]
>>   kzalloc include/linux/slab.h:688 [inline]
>>   pcrypt_create_aead crypto/pcrypt.c:291 [inline]
>>   pcrypt_create+0x137/0x6c0 crypto/pcrypt.c:346
>>   cryptomgr_probe+0x74/0x240 crypto/algboss.c:75
>>   kthread+0x37a/0x440 kernel/kthread.c:238
>>   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:441
>>
>> Freed by task 3335:
>>   save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>>   set_track mm/kasan/kasan.c:459 [inline]
>>   kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
>>   __cache_free mm/slab.c:3488 [inline]
>>   kfree+0xca/0x250 mm/slab.c:3803
>>   crypto_larval_destroy+0x110/0x150 crypto/api.c:107
>>   crypto_alg_put crypto/internal.h:116 [inline]
>>   crypto_larval_kill+0x1e8/0x2e0 crypto/api.c:167
>>   crypto_alg_mod_lookup+0x178/0x1b0 crypto/api.c:283
>>   crypto_find_alg crypto/api.c:501 [inline]
>>   crypto_alloc_tfm+0xf3/0x2f0 crypto/api.c:534
>>   crypto_alloc_aead+0x2c/0x40 crypto/aead.c:342
>>   aead_bind+0x70/0x140 crypto/algif_aead.c:482
>>   alg_bind+0x1ab/0x440 crypto/af_alg.c:179
>>   SYSC_bind+0x1b4/0x3f0 net/socket.c:1454
>>   SyS_bind+0x24/0x30 net/socket.c:1440
>>   do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
>>   do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
>>   entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
>>
> This issue vanishes after applying the patch "[PATCH v2] crypto: AF_ALG -
> limit mask and type".


Hi Stephan,

syzbot does not understand arbitrary English prose, it only understands this:

> Once a fix for this bug is merged into any tree, reply to this email with:
> #syz fix: exact-commit-title

Let's tell it about the fix:

#syz fix: crypto: AF_ALG - limit mask and type


Re: KASAN: use-after-free Read in crypto_aead_free_instance

2017-12-20 Thread Stephan Müller
Am Mittwoch, 20. Dezember 2017, 08:48:01 CET schrieb syzbot:

Hi,

> Hello,
> 
> syzkaller hit the following crash on
> 032b4cc8ff84490c4bc7c4ef8c91e6d83a637538
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
> 
> 
> ==
> BUG: KASAN: use-after-free in crypto_aead_free_instance+0xc0/0xd0
> crypto/aead.c:154
> Read of size 8 at addr 8801c32cf240 by task cryptomgr_test/6646
> 
> CPU: 1 PID: 6646 Comm: cryptomgr_test Not tainted 4.15.0-rc3+ #132
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>   __dump_stack lib/dump_stack.c:17 [inline]
>   dump_stack+0x194/0x257 lib/dump_stack.c:53
>   print_address_description+0x73/0x250 mm/kasan/report.c:252
>   kasan_report_error mm/kasan/report.c:351 [inline]
>   kasan_report+0x25b/0x340 mm/kasan/report.c:409
>   __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
>   crypto_aead_free_instance+0xc0/0xd0 crypto/aead.c:154
>   crypto_free_instance+0x6d/0x100 crypto/algapi.c:77
>   crypto_destroy_instance+0x3c/0x80 crypto/algapi.c:85
>   crypto_alg_put crypto/internal.h:116 [inline]
>   crypto_remove_final+0x212/0x370 crypto/algapi.c:331
>   crypto_alg_tested+0x445/0x6f0 crypto/algapi.c:320
>   cryptomgr_test+0x17/0x30 crypto/algboss.c:226
>   kthread+0x37a/0x440 kernel/kthread.c:238
>   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:441
> 
> Allocated by task 6641:
>   save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>   set_track mm/kasan/kasan.c:459 [inline]
>   kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
>   kmem_cache_alloc_trace+0x136/0x750 mm/slab.c:3610
>   kmalloc include/linux/slab.h:499 [inline]
>   kzalloc include/linux/slab.h:688 [inline]
>   pcrypt_create_aead crypto/pcrypt.c:291 [inline]
>   pcrypt_create+0x137/0x6c0 crypto/pcrypt.c:346
>   cryptomgr_probe+0x74/0x240 crypto/algboss.c:75
>   kthread+0x37a/0x440 kernel/kthread.c:238
>   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:441
> 
> Freed by task 3335:
>   save_stack+0x43/0xd0 mm/kasan/kasan.c:447
>   set_track mm/kasan/kasan.c:459 [inline]
>   kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
>   __cache_free mm/slab.c:3488 [inline]
>   kfree+0xca/0x250 mm/slab.c:3803
>   crypto_larval_destroy+0x110/0x150 crypto/api.c:107
>   crypto_alg_put crypto/internal.h:116 [inline]
>   crypto_larval_kill+0x1e8/0x2e0 crypto/api.c:167
>   crypto_alg_mod_lookup+0x178/0x1b0 crypto/api.c:283
>   crypto_find_alg crypto/api.c:501 [inline]
>   crypto_alloc_tfm+0xf3/0x2f0 crypto/api.c:534
>   crypto_alloc_aead+0x2c/0x40 crypto/aead.c:342
>   aead_bind+0x70/0x140 crypto/algif_aead.c:482
>   alg_bind+0x1ab/0x440 crypto/af_alg.c:179
>   SYSC_bind+0x1b4/0x3f0 net/socket.c:1454
>   SyS_bind+0x24/0x30 net/socket.c:1440
>   do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
>   do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
>   entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
> 
This issue vanishes after applying the patch "[PATCH v2] crypto: AF_ALG - 
limit mask and type".

Ciao
Stephan


Re: [PATCH v2 2/2] crypto: Fix out-of-bounds access of the AAD buffer in generic-gcm-aesni

2017-12-20 Thread Eric Biggers
On Tue, Dec 19, 2017 at 08:42:59PM -0800, Junaid Shahid wrote:
> The aesni_gcm_enc/dec functions can access memory after the end of
> the AAD buffer if the AAD length is not a multiple of 4 bytes.
> It didn't matter with rfc4106-gcm-aesni as in that case the AAD was
> always followed by the 8 byte IV, but that is no longer the case with
> generic-gcm-aesni. This can potentially result in accessing a page that
> is not mapped and thus causing the machine to crash. This patch fixes
> that by reading the last <16 byte block of the AAD byte-by-byte and
> optionally via an 8-byte load if the block was at least 8 bytes.
> 
> Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen")
[...]
> -_get_AAD_rest0\num_initial_blocks\operation:
> - /* finalize: shift out the extra bytes we read, and align
> - left. since pslldq can only shift by an immediate, we use
> - vpshufb and an array of shuffle masks */
> - movq   %r12, %r11
> - salq   $4, %r11
> - movdqu aad_shift_arr(%r11), \TMP1
> - PSHUFB_XMM \TMP1, %xmm\i

aad_shift_arr is no longer used, so it should be removed.

> -_get_AAD_rest_final\num_initial_blocks\operation:
> + READ_PARTIAL_BLOCK %r10, %r12, %r11, \TMP1, \TMP2, %xmm\i

It seems that INITIAL_BLOCKS_DEC and INITIAL_BLOCKS_ENC maintain both %r11 and
%r12 as the AAD length.  %r11 is used for real earlier, but here %r12 is used
for real while %r11 is a temporary register.  Can this be simplified to have the
AAD length in %r11 only?

Eric


Re: [PATCH v2 1/2] crypto: Fix out-of-bounds access of the data buffer in generic-gcm-aesni

2017-12-20 Thread Eric Biggers
On Tue, Dec 19, 2017 at 08:42:58PM -0800, Junaid Shahid wrote:
> The aesni_gcm_enc/dec functions can access memory before the start of
> the data buffer if the length of the data buffer is less than 16 bytes.
> This is because they perform the read via a single 16-byte load. This
> can potentially result in accessing a page that is not mapped and thus
> causing the machine to crash. This patch fixes that by reading the
> partial block byte-by-byte and optionally an via 8-byte load if the block
> was at least 8 bytes.
> 
> Fixes: 0487ccac ("crypto: aesni - make non-AVX AES-GCM work with any aadlen")
> Signed-off-by: Junaid Shahid 

Did you run the self-tests (boot with CONFIG_CRYPTO_MANAGER_DISABLE_TESTS
unset)?  The second patch causes them to start failing:

[1.169640] alg: aead: Test 7 failed on encryption for rfc4106-gcm-aesni
[1.178308] alg: aead: Test 4 failed on encryption for generic-gcm-aesni

Also, don't the AVX and AVX2 versions have the same bug?  These patches only fix
the non-AVX version.

> +# read the last <16 byte block
> +# Clobbers %rax, DPTR, TMP1 and XMM1-2
> +.macro READ_PARTIAL_BLOCK DPTR DLEN TMP1 XMM1 XMM2 XMMDst
> +pxor \XMMDst, \XMMDst
> +mov \DLEN, \TMP1
> +cmp $8, \DLEN
> +jl _read_last_lt8_\@
> +mov (\DPTR), %rax
> +MOVQ_R64_XMM %rax, \XMMDst
> + add $8, \DPTR
> +sub $8, \TMP1
> +jz _done_read_last_partial_block_\@
> +_read_last_lt8_\@:
> + shl $8, %rax
> +mov -1(\DPTR, \TMP1, 1), %al
> +dec \TMP1
> +jnz _read_last_lt8_\@
> +MOVQ_R64_XMM %rax, \XMM1

Doesn't %rax needs to be zeroed at '_read_last_lt8' so that the padding is
zeroed?

> + # adjust the shuffle mask pointer to be able to shift either 0 or 8
> + # bytes depending on whether the last block is <8 bytes or not
> +mov \DLEN, \TMP1
> +and $8, \TMP1
> + lea SHIFT_MASK(%rip), %rax
> + sub \TMP1, %rax
> + movdqu (%rax), \XMM2# get the appropriate shuffle mask
> + PSHUFB_XMM \XMM2, \XMM1 # shift left either 0 or 8 bytes

Is there any way this can be simplified to use pslldq?  The shift is just 8
bytes or nothing, and we already had to branch earlier depending on whether we
needed to read the 8 bytes or not.

Eric


[PATCH RESEND] checkpatch: add *_ON_STACK to declaration_macros

2017-12-20 Thread Gilad Ben-Yossef
Add the crypto API *_ON_STACK to $declaration_macros.

Resolves the following false warning:

WARNING: Missing a blank line after declarations
+   int err;
+   SHASH_DESC_ON_STACK(desc, ctx_p->shash_tfm);

Signed-off-by: Gilad Ben-Yossef 
---
 scripts/checkpatch.pl | 1 +
 1 file changed, 1 insertion(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 168687a..e7fa1a2 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -759,6 +759,7 @@ our $FuncArg = 
qr{$Typecast{0,1}($LvalOrFunc|$Constant|$String)};
 our $declaration_macros = qr{(?x:

(?:$Storage\s+)?(?:[A-Z_][A-Z0-9]*_){0,2}(?:DEFINE|DECLARE)(?:_[A-Z0-9]+){1,6}\s*\(|
(?:$Storage\s+)?[HLP]?LIST_HEAD\s*\(|
+   (?:$Storage\s+)?[A-Z_]*_ON_STACK\(|
(?:$Storage\s+)?${Type}\s+uninitialized_var\s*\(
 )};
 
-- 
2.7.4