[RFC PATCH cryptodev] crypto: sec_send_request() can be static
Fixes: 915e4e8413da ("crypto: hisilicon - SEC security accelerator driver")
Signed-off-by: kbuild test robot
---
 sec_algs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/hisilicon/sec/sec_algs.c b/drivers/crypto/hisilicon/sec/sec_algs.c
index d69d3ce..f7d6d69 100644
--- a/drivers/crypto/hisilicon/sec/sec_algs.c
+++ b/drivers/crypto/hisilicon/sec/sec_algs.c
@@ -393,7 +393,7 @@ static void sec_alg_free_el(struct sec_request_el *el,
 }
 
 /* queuelock must be held */
-int sec_send_request(struct sec_request *sec_req, struct sec_queue *queue)
+static int sec_send_request(struct sec_request *sec_req, struct sec_queue *queue)
 {
 	struct sec_request_el *el, *temp;
 	int ret = 0;
[cryptodev:master 99/119] drivers/crypto/hisilicon/sec/sec_algs.c:396:5: sparse: symbol 'sec_send_request' was not declared. Should it be static?
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
head:   a94bfd5f50ddf114594f5c7f99686f1cfa2b7a76
commit: 915e4e8413dacc086efcef4de04fdfdca57e8b1c [99/119] crypto: hisilicon - SEC security accelerator driver
reproduce:
        # apt-get install sparse
        git checkout 915e4e8413dacc086efcef4de04fdfdca57e8b1c
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__

sparse warnings: (new ones prefixed by >>)

>> drivers/crypto/hisilicon/sec/sec_algs.c:396:5: sparse: symbol 'sec_send_request' was not declared. Should it be static?
   include/linux/slab.h:631:13: sparse: undefined identifier '__builtin_mul_overflow'
   include/linux/slab.h:631:13: sparse: not a function
   include/linux/slab.h:631:13: sparse: not a function
   include/linux/slab.h:631:13: sparse: call with no type!

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: [PATCH v2 0/3] crypto/arm64: aes-ce-gcm - switch to 2-way aggregation
On 3 August 2018 at 17:47, Herbert Xu wrote:
> On Mon, Jul 30, 2018 at 11:06:39PM +0200, Ard Biesheuvel wrote:
>> Update the combined AES-GCM AEAD implementation to process two blocks
>> at a time, allowing us to switch to a faster version of the GHASH
>> implementation.
>>
>> Note that this does not update the core GHASH transform, only the
>> combined AES-GCM AEAD mode. GHASH is mostly used with AES anyway, and
>> the ARMv8 architecture mandates support for AES instructions if
>> 64-bit polynomial multiplication instructions are implemented. This
>> means that most users of the pmull.p64 based GHASH routines are better
>> off using the combined AES-GCM code anyway. Users of the pmull.p8 based
>> GHASH implementation are unlikely to benefit substantially from
>> aggregation, given that the multiplication phase is much more dominant
>> in this case (and it is only the reduction phase that is amortized over
>> multiple blocks).
>>
>> Performance numbers for Cortex-A53 can be found after patches #2 and #3.
>>
>> Changes since v1:
>> - rebase to take the changes in patch 'crypto: arm64 - revert NEON yield
>>   for fast AEAD implementations' which I sent out on July 29th
>> - add a patch to reduce the number of invocations of kernel_neon_begin()
>>   and kernel_neon_end() on the common path
>>
>> Ard Biesheuvel (3):
>>   crypto/arm64: aes-ce-gcm - operate on two input blocks at a time
>>   crypto/arm64: aes-ce-gcm - implement 2-way aggregation
>>   crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable
>>
>>  arch/arm64/crypto/ghash-ce-core.S | 136 +--
>>  arch/arm64/crypto/ghash-ce-glue.c | 176
>>  2 files changed, 198 insertions(+), 114 deletions(-)
>
> All applied.  Thanks.

Thanks Herbert.
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote: > As it turns out, checking the TIF_NEED_RESCHED flag after each > iteration results in a significant performance regression (~10%) > when running fast algorithms (i.e., ones that use special instructions > and operate in the < 4 cycles per byte range) on in-order cores with > comparatively slow memory accesses such as the Cortex-A53. > > Given the speed of these ciphers, and the fact that the page based > nature of the AEAD scatterwalk API guarantees that the core NEON > transform is never invoked with more than a single page's worth of > input, we can estimate the worst case duration of any resulting > scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages, > processing a page's worth of input at 4 cycles per byte results in > a delay of ~250 us, which is a reasonable upper bound. > > So let's remove the yield checks from the fused AES-CCM and AES-GCM > routines entirely. > > This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and > partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3. > > Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...") > Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...") > Signed-off-by: Ard Biesheuvel Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2 0/3] crypto/arm64: aes-ce-gcm - switch to 2-way aggregation
On Mon, Jul 30, 2018 at 11:06:39PM +0200, Ard Biesheuvel wrote:
> Update the combined AES-GCM AEAD implementation to process two blocks
> at a time, allowing us to switch to a faster version of the GHASH
> implementation.
>
> Note that this does not update the core GHASH transform, only the
> combined AES-GCM AEAD mode. GHASH is mostly used with AES anyway, and
> the ARMv8 architecture mandates support for AES instructions if
> 64-bit polynomial multiplication instructions are implemented. This
> means that most users of the pmull.p64 based GHASH routines are better
> off using the combined AES-GCM code anyway. Users of the pmull.p8 based
> GHASH implementation are unlikely to benefit substantially from
> aggregation, given that the multiplication phase is much more dominant
> in this case (and it is only the reduction phase that is amortized over
> multiple blocks).
>
> Performance numbers for Cortex-A53 can be found after patches #2 and #3.
>
> Changes since v1:
> - rebase to take the changes in patch 'crypto: arm64 - revert NEON yield
>   for fast AEAD implementations' which I sent out on July 29th
> - add a patch to reduce the number of invocations of kernel_neon_begin()
>   and kernel_neon_end() on the common path
>
> Ard Biesheuvel (3):
>   crypto/arm64: aes-ce-gcm - operate on two input blocks at a time
>   crypto/arm64: aes-ce-gcm - implement 2-way aggregation
>   crypto: arm64/aes-ce-gcm - don't reload key schedule if avoidable
>
>  arch/arm64/crypto/ghash-ce-core.S | 136 +--
>  arch/arm64/crypto/ghash-ce-glue.c | 176
>  2 files changed, 198 insertions(+), 114 deletions(-)

All applied.  Thanks.

-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2 1/2] crypto: dh - fix calculating encoded key size
On Fri, Jul 27, 2018 at 03:36:10PM -0700, Eric Biggers wrote: > From: Eric Biggers > > It was forgotten to increase DH_KPP_SECRET_MIN_SIZE to include 'q_size', > causing an out-of-bounds write of 4 bytes in crypto_dh_encode_key(), and > an out-of-bounds read of 4 bytes in crypto_dh_decode_key(). Fix it, and > fix the lengths of the test vectors to match this. > > Reported-by: syzbot+6d38d558c25b53b8f...@syzkaller.appspotmail.com > Fixes: e3fe0ae12962 ("crypto: dh - add public key verification test") > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: ccp: Check for NULL PSP pointer at module unload
Tom Lendacky wrote: > Should the PSP initialization fail, the PSP data structure will be > freed and the value contained in the sp_device struct set to NULL. > At module unload, psp_dev_destroy() does not check if the pointer > value is NULL and will end up dereferencing a NULL pointer. > > Add a pointer check of the psp_data field in the sp_device struct > in psp_dev_destroy() and return immediately if it is NULL. > > Cc: # 4.16.x- > Fixes: 2a6170dfe755 ("crypto: ccp: Add Platform Security Processor (PSP) > device support") > Signed-off-by: Tom Lendacky Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH v2 2/2] crypto: dh - make crypto_dh_encode_key() more robust
On Fri, Jul 27, 2018 at 03:36:11PM -0700, Eric Biggers wrote: > From: Eric Biggers > > Make it return -EINVAL if crypto_dh_key_len() is incorrect rather than > overflowing the buffer. > > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: arm/chacha20 - always use vrev for 16-bit rotates
On Tue, Jul 24, 2018 at 06:29:07PM -0700, Eric Biggers wrote: > From: Eric Biggers > > The 4-way ChaCha20 NEON code implements 16-bit rotates with vrev32.16, > but the one-way code (used on remainder blocks) implements it with > vshl + vsri, which is slower. Switch the one-way code to vrev32.16 too. > > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 0/3] crypto: fix crash in scatterwalk_pagedone()
On Mon, Jul 23, 2018 at 10:54:55AM -0700, Eric Biggers wrote: > From: Eric Biggers > > This series fixes the bug reported by Liu Chao (found using syzkaller) > where a crash occurs in scatterwalk_pagedone() on architectures such as > arm and arm64 that implement flush_dcache_page(), due to an invalid page > pointer when walk->offset == 0. This series attempts to address the > underlying problem which is that scatterwalk_pagedone() shouldn't have > been called at all in that case. > > Eric Biggers (3): > crypto: skcipher - fix crash flushing dcache in error path > crypto: blkcipher - fix crash flushing dcache in error path > crypto: ablkcipher - fix crash flushing dcache in error path > > crypto/ablkcipher.c | 57 + > crypto/blkcipher.c | 54 +- > crypto/skcipher.c | 53 - > 3 files changed, 79 insertions(+), 85 deletions(-) All applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: skcipher - remove unnecessary setting of walk->nbytes
On Mon, Jul 23, 2018 at 10:21:29AM -0700, Eric Biggers wrote: > From: Eric Biggers > > Setting 'walk->nbytes = walk->total' in skcipher_walk_first() doesn't > make sense because actually walk->nbytes needs to be set to the length > of the first step in the walk, which may be less than walk->total. This > is done by skcipher_walk_next() which is called immediately afterwards. > Also walk->nbytes was already set to 0 in skcipher_walk_skcipher(), > which is a better default value in case it's forgotten to be set later. > > Therefore, remove the unnecessary assignment to walk->nbytes. > > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: scatterwalk - remove scatterwalk_samebuf()
On Mon, Jul 23, 2018 at 10:04:28AM -0700, Eric Biggers wrote: > From: Eric Biggers > > scatterwalk_samebuf() is never used. Remove it. > > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: skcipher - fix aligning block size in skcipher_copy_iv()
On Mon, Jul 23, 2018 at 09:57:50AM -0700, Eric Biggers wrote: > From: Eric Biggers > > The ALIGN() macro needs to be passed the alignment, not the alignmask > (which is the alignment minus 1). > > Fixes: b286d8b1a690 ("crypto: skcipher - Add skcipher walk interface") > Cc: # v4.10+ > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: scatterwalk - remove 'chain' argument from scatterwalk_crypto_chain()
On Mon, Jul 23, 2018 at 10:01:33AM -0700, Eric Biggers wrote: > From: Eric Biggers > > All callers pass chain=0 to scatterwalk_crypto_chain(). > > Remove this unneeded parameter. > > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: tcrypt - reschedule during speed tests
On Mon, Jul 23, 2018 at 05:18:48PM +0300, Horia Geantă wrote: > Avoid RCU stalls in the case of non-preemptible kernel and lengthy > speed tests by rescheduling when advancing from one block size > to another. > > Signed-off-by: Horia Geantă Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH V2 0/3] Hisilicon SEC crypto driver (hip06 / hip07)
On Mon, Jul 23, 2018 at 04:49:52PM +0100, Jonathan Cameron wrote:
> The driver provides in kernel support for the Hisilicon SEC accelerator
> found in the hip06 and hip07 SoCs. There are 4 such units on the D05
> board for which an appropriate DT binding has been provided. ACPI also
> works with an appropriate UEFI build.
>
> The hardware does not update the IV in chaining or counting modes.
> This is done in the driver on completion of the cipher operation.
>
> The driver supports AES, DES and 3DES block ciphers in a range of
> modes (others to follow). Hash and AEAD support to follow.
>
> Sorry for the delay on this one, other priorities and all that...
>
> Changes since V1:
> 1) DT binding fixes suggested by Rob Herring in patches 1 and 3.
> 2) Added XTS key check as suggested by Stephan Muller.
> 3) A trivial use after free found during testing of the above.
>
> Changes since RFC:
> 1) Addition of backlog queuing as needed to support dm-crypt usecases.
> 2) iommu presence tests now done as Robin Murphy suggested.
> 3) Hardware limitation to 32MB requests worked around in the driver so
>    it will now support very large requests (512*32MB). Larger request
>    handling than this would require a longer queue with the associated
>    overheads and is considered unlikely to be necessary.
> 4) The specific handling related to the inline IV patch set from Stephan
>    has been dropped for now.
> 5) Interrupt handler was previously more complex than necessary so has
>    been reworked.
> 6) Use of the bounce buffer for small packets is dropped for now. This
>    is a performance optimization that made the code harder to review and
>    can be reintroduced as necessary at a later date.
> 7) Restructuring of some code to simplify hash and aead (hash implemented
>    but not ready for upstream at this time).
> 8) Various minor fixes and reworks of the code
>    * several off by one errors in the cleanup paths
>    * single template for enc and dec
>    * drop dec_key as not used (enc_key was used in both cases)
>    * drop dma pool for IVs as it breaks chaining
>    * lots of spinlocks changed to mutexes as not taken in atomic context
>    * nasty memory leak cleaned up
>
> Jonathan Cameron (3):
>   dt-bindings: Add bindings for Hisilicon SEC crypto accelerators.
>   crypto: hisilicon SEC security accelerator driver
>   arm64: dts: hisi: add SEC crypto accelerator nodes for hip07 SoC
>
>  .../bindings/crypto/hisilicon,hip07-sec.txt |   67 +
>  arch/arm64/boot/dts/hisilicon/hip07.dtsi    |  284 +
>  drivers/crypto/Kconfig                      |    2 +
>  drivers/crypto/Makefile                     |    1 +
>  drivers/crypto/hisilicon/Kconfig            |   14 +
>  drivers/crypto/hisilicon/Makefile           |    2 +
>  drivers/crypto/hisilicon/sec/Makefile       |    3 +
>  drivers/crypto/hisilicon/sec/sec_algs.c     | 1122 +
>  drivers/crypto/hisilicon/sec/sec_drv.c      | 1323
>  drivers/crypto/hisilicon/sec/sec_drv.h      |  428 +++
>  10 files changed, 3246 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/crypto/hisilicon,hip07-sec.txt
>  create mode 100644 drivers/crypto/hisilicon/Kconfig
>  create mode 100644 drivers/crypto/hisilicon/Makefile
>  create mode 100644 drivers/crypto/hisilicon/sec/Makefile
>  create mode 100644 drivers/crypto/hisilicon/sec/sec_algs.c
>  create mode 100644 drivers/crypto/hisilicon/sec/sec_drv.c
>  create mode 100644 drivers/crypto/hisilicon/sec/sec_drv.h

All applied.  Thanks.

-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: CTR DRBG - in-place cipher operation
On Fri, Jul 20, 2018 at 07:42:01PM +0200, Stephan Müller wrote:
> The cipher implementations of the kernel crypto API favor in-place
> cipher operations. Thus, switch the CTR cipher operation in the DRBG to
> perform in-place operations. This is implemented by using the output
> buffer as input buffer and zeroizing it before the cipher operation to
> implement a CTR encryption of a NULL buffer.
>
> The speed improvement is quite visible with the following comparison
> using the LRNG implementation.
>
> Without the patch set:
>
>   16 bytes |  12.267661 MB/s |   61338304 bytes | 500213 ns
>   32 bytes |  23.603770 MB/s |  118018848 bytes | 500073 ns
>   64 bytes |  46.732262 MB/s |  233661312 bytes | 500241 ns
>  128 bytes |  90.038042 MB/s |  450190208 bytes | 500244 ns
>  256 bytes | 160.399616 MB/s |  801998080 bytes | 500393 ns
>  512 bytes | 259.878400 MB/s | 1299392000 bytes | 501675 ns
> 1024 bytes | 386.050662 MB/s | 1930253312 bytes | 501661 ns
> 2048 bytes | 493.641728 MB/s | 2468208640 bytes | 501598 ns
> 4096 bytes | 581.835981 MB/s | 2909179904 bytes | 503426 ns
>
> With the patch set:
>
>   16 bytes |  17.051142 MB/s |   85255712 bytes | 500854 ns
>   32 bytes |  32.695898 MB/s |  163479488 bytes | 500544 ns
>   64 bytes |  64.490739 MB/s |  322453696 bytes | 500954 ns
>  128 bytes | 123.285043 MB/s |  616425216 bytes | 500201 ns
>  256 bytes | 233.434573 MB/s | 1167172864 bytes | 500573 ns
>  512 bytes | 384.405197 MB/s | 1922025984 bytes | 500671 ns
> 1024 bytes | 566.313370 MB/s | 2831566848 bytes | 501080 ns
> 2048 bytes | 744.518042 MB/s | 3722590208 bytes | 500926 ns
> 4096 bytes | 867.501670 MB/s | 4337508352 bytes | 502181 ns
>
> Signed-off-by: Stephan Mueller

Patch applied.  Thanks.

-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH v2] crypto: x86/aegis,morus - Fix and simplify CPUID checks
It turns out I had misunderstood how the x86_match_cpu() function works. It evaluates a logical OR of the matching conditions, not logical AND. This caused the CPU feature checks for AEGIS to pass even if only SSE2 (but not AES-NI) was supported (or vice versa), leading to potential crashes if something tried to use the registered algs. This patch switches the checks to a simpler method that is used e.g. in the Camellia x86 code. The patch also removes the MODULE_DEVICE_TABLE declarations which actually seem to cause the modules to be auto-loaded at boot, which is not desired. The crypto API on-demand module loading is sufficient. Fixes: 1d373d4e8e15 ("crypto: x86 - Add optimized AEGIS implementations") Fixes: 6ecc9d9ff91f ("crypto: x86 - Add optimized MORUS implementations") Signed-off-by: Ondrej Mosnacek --- arch/x86/crypto/aegis128-aesni-glue.c | 12 arch/x86/crypto/aegis128l-aesni-glue.c | 12 arch/x86/crypto/aegis256-aesni-glue.c | 12 arch/x86/crypto/morus1280-avx2-glue.c | 10 +++--- arch/x86/crypto/morus1280-sse2-glue.c | 10 +++--- arch/x86/crypto/morus640-sse2-glue.c | 10 +++--- 6 files changed, 21 insertions(+), 45 deletions(-) diff --git a/arch/x86/crypto/aegis128-aesni-glue.c b/arch/x86/crypto/aegis128-aesni-glue.c index 5de7c0d46edf..acd11b3bf639 100644 --- a/arch/x86/crypto/aegis128-aesni-glue.c +++ b/arch/x86/crypto/aegis128-aesni-glue.c @@ -375,16 +375,12 @@ static struct aead_alg crypto_aegis128_aesni_alg[] = { } }; -static const struct x86_cpu_id aesni_cpu_id[] = { - X86_FEATURE_MATCH(X86_FEATURE_AES), - X86_FEATURE_MATCH(X86_FEATURE_XMM2), - {} -}; -MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); - static int __init crypto_aegis128_aesni_module_init(void) { - if (!x86_match_cpu(aesni_cpu_id)) + if (!boot_cpu_has(X86_FEATURE_XMM2) || + !boot_cpu_has(X86_FEATURE_AES) || + !boot_cpu_has(X86_FEATURE_OSXSAVE) || + !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) return -ENODEV; return crypto_register_aeads(crypto_aegis128_aesni_alg, diff --git 
a/arch/x86/crypto/aegis128l-aesni-glue.c b/arch/x86/crypto/aegis128l-aesni-glue.c index 876e4866e633..2071c3d1ae07 100644 --- a/arch/x86/crypto/aegis128l-aesni-glue.c +++ b/arch/x86/crypto/aegis128l-aesni-glue.c @@ -375,16 +375,12 @@ static struct aead_alg crypto_aegis128l_aesni_alg[] = { } }; -static const struct x86_cpu_id aesni_cpu_id[] = { - X86_FEATURE_MATCH(X86_FEATURE_AES), - X86_FEATURE_MATCH(X86_FEATURE_XMM2), - {} -}; -MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); - static int __init crypto_aegis128l_aesni_module_init(void) { - if (!x86_match_cpu(aesni_cpu_id)) + if (!boot_cpu_has(X86_FEATURE_XMM2) || + !boot_cpu_has(X86_FEATURE_AES) || + !boot_cpu_has(X86_FEATURE_OSXSAVE) || + !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) return -ENODEV; return crypto_register_aeads(crypto_aegis128l_aesni_alg, diff --git a/arch/x86/crypto/aegis256-aesni-glue.c b/arch/x86/crypto/aegis256-aesni-glue.c index 2b5dd3af8f4d..b5f2a8fd5a71 100644 --- a/arch/x86/crypto/aegis256-aesni-glue.c +++ b/arch/x86/crypto/aegis256-aesni-glue.c @@ -375,16 +375,12 @@ static struct aead_alg crypto_aegis256_aesni_alg[] = { } }; -static const struct x86_cpu_id aesni_cpu_id[] = { - X86_FEATURE_MATCH(X86_FEATURE_AES), - X86_FEATURE_MATCH(X86_FEATURE_XMM2), - {} -}; -MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); - static int __init crypto_aegis256_aesni_module_init(void) { - if (!x86_match_cpu(aesni_cpu_id)) + if (!boot_cpu_has(X86_FEATURE_XMM2) || + !boot_cpu_has(X86_FEATURE_AES) || + !boot_cpu_has(X86_FEATURE_OSXSAVE) || + !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) return -ENODEV; return crypto_register_aeads(crypto_aegis256_aesni_alg, diff --git a/arch/x86/crypto/morus1280-avx2-glue.c b/arch/x86/crypto/morus1280-avx2-glue.c index f111f36d26dc..6634907d6ccd 100644 --- a/arch/x86/crypto/morus1280-avx2-glue.c +++ b/arch/x86/crypto/morus1280-avx2-glue.c @@ -37,15 +37,11 @@ asmlinkage void crypto_morus1280_avx2_final(void *state, void *tag_xor, MORUS1280_DECLARE_ALGS(avx2, "morus1280-avx2", 400); -static 
const struct x86_cpu_id avx2_cpu_id[] = { -X86_FEATURE_MATCH(X86_FEATURE_AVX2), -{} -}; -MODULE_DEVICE_TABLE(x86cpu, avx2_cpu_id); - static int __init crypto_morus1280_avx2_module_init(void) { - if (!x86_match_cpu(avx2_cpu_id)) + if (!boot_cpu_has(X86_FEATURE_AVX2) || + !boot_cpu_has(X86_FEATURE_OSXSAVE) || + !cpu_has_xfeatures(XFEATURE_MASK_SSE | XFEATURE_MASK_YMM, NULL)) return -ENODEV; return crypto_register_aeads(crypto_morus1280_avx2_algs, diff --git a/arch/x86/crypto/morus1280-sse2-glue.c b/arch/x86/crypto/morus1280-sse2-glue.c index 839270aa713c..95cf857d2cbb
Re: [PATCH] crypto: x86/aegis - Fix CPUID checks
On Thu, Aug 2, 2018 at 11:17 AM Ondrej Mosnacek wrote:
> It turns out I had misunderstood how the x86_match_cpu() function works.
> It evaluates a logical OR of the matching conditions, not logical AND.
> This caused the CPU feature checks to pass even if only SSE2 (but not
> AES-NI) was supported (or vice versa), leading to potential crashes if
> something tried to use the registered algs.
>
> This patch fixes the checks to ensure that both required CPU features
> are supported. The MODULE_DEVICE_TABLE declaration is specified only for
> the AES-NI feature array, because I'm not sure what having multiple such
> declarations would cause and I believe it should suffice to match on the
> more important feature at the alias level and let the init routine do
> the full check.
>
> Signed-off-by: Ondrej Mosnacek
> ---
> Hi Herbert,
>
> this patch fixes the CPU checks of the AEGIS AES-NI/SSE2 implementations
> that have been introduced in 4.18. Once reviewed, it should go to Linus'
> tree since it may cause crashes on some systems if the corresponding
> configs are enabled.
>
> @x86 folks, please take a look if I use the MODULE_DEVICE_TABLE macro
> correctly here (I'm not sure how to properly declare that the module
> needs two CPU features to be both supported; all other modules I saw had
> either only a single match rule or required just one of the rules to
> match).

Never mind, I realized MODULE_DEVICE_TABLE isn't actually needed at all
here (most modules in arch/x86/crypto don't even use it) and IIUC it will
cause the modules to load automatically on boot (which isn't desirable for
these new algorithms). I'll send a v2 of the patch that converts both
AEGIS and MORUS checks to what most other modules use (e.g. the Camellia
x86 optimizations).
> > Thanks, > > Ondrej Mosnacek > > arch/x86/crypto/aegis128-aesni-glue.c | 8 ++-- > arch/x86/crypto/aegis128l-aesni-glue.c | 8 ++-- > arch/x86/crypto/aegis256-aesni-glue.c | 8 ++-- > 3 files changed, 18 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/crypto/aegis128-aesni-glue.c > b/arch/x86/crypto/aegis128-aesni-glue.c > index 5de7c0d46edf..6a5abed59593 100644 > --- a/arch/x86/crypto/aegis128-aesni-glue.c > +++ b/arch/x86/crypto/aegis128-aesni-glue.c > @@ -377,14 +377,18 @@ static struct aead_alg crypto_aegis128_aesni_alg[] = { > > static const struct x86_cpu_id aesni_cpu_id[] = { > X86_FEATURE_MATCH(X86_FEATURE_AES), > - X86_FEATURE_MATCH(X86_FEATURE_XMM2), > {} > }; > MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); > > +static const struct x86_cpu_id sse2_cpu_id[] = { > + X86_FEATURE_MATCH(X86_FEATURE_XMM2), > + {} > +}; > + > static int __init crypto_aegis128_aesni_module_init(void) > { > - if (!x86_match_cpu(aesni_cpu_id)) > + if (!x86_match_cpu(aesni_cpu_id) || !x86_match_cpu(sse2_cpu_id)) > return -ENODEV; > > return crypto_register_aeads(crypto_aegis128_aesni_alg, > diff --git a/arch/x86/crypto/aegis128l-aesni-glue.c > b/arch/x86/crypto/aegis128l-aesni-glue.c > index 876e4866e633..691c52701c2d 100644 > --- a/arch/x86/crypto/aegis128l-aesni-glue.c > +++ b/arch/x86/crypto/aegis128l-aesni-glue.c > @@ -377,14 +377,18 @@ static struct aead_alg crypto_aegis128l_aesni_alg[] = { > > static const struct x86_cpu_id aesni_cpu_id[] = { > X86_FEATURE_MATCH(X86_FEATURE_AES), > - X86_FEATURE_MATCH(X86_FEATURE_XMM2), > {} > }; > MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); > > +static const struct x86_cpu_id sse2_cpu_id[] = { > + X86_FEATURE_MATCH(X86_FEATURE_XMM2), > + {} > +}; > + > static int __init crypto_aegis128l_aesni_module_init(void) > { > - if (!x86_match_cpu(aesni_cpu_id)) > + if (!x86_match_cpu(aesni_cpu_id) || !x86_match_cpu(sse2_cpu_id)) > return -ENODEV; > > return crypto_register_aeads(crypto_aegis128l_aesni_alg, > diff --git 
a/arch/x86/crypto/aegis256-aesni-glue.c > b/arch/x86/crypto/aegis256-aesni-glue.c > index 2b5dd3af8f4d..481b5e4f4fd0 100644 > --- a/arch/x86/crypto/aegis256-aesni-glue.c > +++ b/arch/x86/crypto/aegis256-aesni-glue.c > @@ -377,14 +377,18 @@ static struct aead_alg crypto_aegis256_aesni_alg[] = { > > static const struct x86_cpu_id aesni_cpu_id[] = { > X86_FEATURE_MATCH(X86_FEATURE_AES), > - X86_FEATURE_MATCH(X86_FEATURE_XMM2), > {} > }; > MODULE_DEVICE_TABLE(x86cpu, aesni_cpu_id); > > +static const struct x86_cpu_id sse2_cpu_id[] = { > + X86_FEATURE_MATCH(X86_FEATURE_XMM2), > + {} > +}; > + > static int __init crypto_aegis256_aesni_module_init(void) > { > - if (!x86_match_cpu(aesni_cpu_id)) > + if (!x86_match_cpu(aesni_cpu_id) || !x86_match_cpu(sse2_cpu_id)) > return -ENODEV; > > return crypto_register_aeads(crypto_aegis256_aesni_alg, > -- > 2.17.1 > -- Ondrej Mosnacek Associate Software Engineer, Security Technologies Red Hat, Inc.
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On 3 August 2018 at 10:17, Herbert Xu wrote:
> On Fri, Aug 03, 2018 at 09:10:08AM +0200, Ard Biesheuvel wrote:
>> But I think it's too late now to take this into v4.18. Could you
>> please queue this (and my other two pending arm64/aes-gcm patches, if
>> possible) for v4.19 instead?
>
> OK I'll do that.

Thanks. But, actually, those two pending patches are a 3-piece series
now... (v2 of 'crypto/arm64: aes-ce-gcm - switch to 2-way aggregation')

Thanks,
Ard.
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On Fri, Aug 03, 2018 at 09:10:08AM +0200, Ard Biesheuvel wrote:
> But I think it's too late now to take this into v4.18. Could you
> please queue this (and my other two pending arm64/aes-gcm patches, if
> possible) for v4.19 instead?

OK I'll do that.

Cheers,
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On 3 August 2018 at 08:14, Herbert Xu wrote:
> On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote:
>> As it turns out, checking the TIF_NEED_RESCHED flag after each
>> iteration results in a significant performance regression (~10%)
>> when running fast algorithms (i.e., ones that use special instructions
>> and operate in the < 4 cycles per byte range) on in-order cores with
>> comparatively slow memory accesses such as the Cortex-A53.
>>
>> Given the speed of these ciphers, and the fact that the page based
>> nature of the AEAD scatterwalk API guarantees that the core NEON
>> transform is never invoked with more than a single page's worth of
>> input, we can estimate the worst case duration of any resulting
>> scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages,
>> processing a page's worth of input at 4 cycles per byte results in
>> a delay of ~250 us, which is a reasonable upper bound.
>>
>> So let's remove the yield checks from the fused AES-CCM and AES-GCM
>> routines entirely.
>>
>> This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and
>> partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3.
>>
>> Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...")
>> Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...")
>> Signed-off-by: Ard Biesheuvel
>> ---
>> This supersedes my series 'crypto/arm64: reduce impact of NEON yield
>> checks' sent out on the 24th of July.
>>
>> Given the significant performance regression, we may want to treat this
>> as a fix (the patches in question went into v4.18)
>>
>> This patch applies onto my patch 'crypto/arm64: aes-ce-gcm - add missing
>> kernel_neon_begin/end pair' which I sent out on the 27th of July, which
>> fixes a data corruption bug, which should also be applied as a fix.
>
> Acked-by: Herbert Xu

Thanks Herbert. But I think it's too late now to take this into v4.18.
Could you please queue this (and my other two pending arm64/aes-gcm
patches, if possible) for v4.19 instead?

Thanks,
Re: [PATCH] crypto: arm64 - revert NEON yield for fast AEAD implementations
On Sun, Jul 29, 2018 at 04:52:30PM +0200, Ard Biesheuvel wrote:
> As it turns out, checking the TIF_NEED_RESCHED flag after each
> iteration results in a significant performance regression (~10%)
> when running fast algorithms (i.e., ones that use special instructions
> and operate in the < 4 cycles per byte range) on in-order cores with
> comparatively slow memory accesses such as the Cortex-A53.
>
> Given the speed of these ciphers, and the fact that the page based
> nature of the AEAD scatterwalk API guarantees that the core NEON
> transform is never invoked with more than a single page's worth of
> input, we can estimate the worst case duration of any resulting
> scheduling blackout: on a 1 GHz Cortex-A53 running with 64k pages,
> processing a page's worth of input at 4 cycles per byte results in
> a delay of ~250 us, which is a reasonable upper bound.
>
> So let's remove the yield checks from the fused AES-CCM and AES-GCM
> routines entirely.
>
> This reverts commit 7b67ae4d5ce8e2f912377f5fbccb95811a92097f and
> partially reverts commit 7c50136a8aba8784f07fb66a950cc61a7f3d2ee3.
>
> Fixes: 7c50136a8aba ("crypto: arm64/aes-ghash - yield NEON after every ...")
> Fixes: 7b67ae4d5ce8 ("crypto: arm64/aes-ccm - yield NEON after every ...")
> Signed-off-by: Ard Biesheuvel
> ---
> This supersedes my series 'crypto/arm64: reduce impact of NEON yield
> checks' sent out on the 24th of July.
>
> Given the significant performance regression, we may want to treat this
> as a fix (the patches in question went into v4.18)
>
> This patch applies onto my patch 'crypto/arm64: aes-ce-gcm - add missing
> kernel_neon_begin/end pair' which I sent out on the 27th of July, which
> fixes a data corruption bug, which should also be applied as a fix.

Acked-by: Herbert Xu

Thanks,
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt