Re: [RFC PATCH 4/5] arm64: fpsimd: run kernel mode NEON with softirqs disabled

2021-01-20 Thread Dave Martin
On Tue, Jan 19, 2021 at 05:29:05PM +0100, Ard Biesheuvel wrote: > On Tue, 19 Jan 2021 at 17:01, Dave Martin wrote: > > > > On Fri, Dec 18, 2020 at 06:01:05PM +0100, Ard Biesheuvel wrote: > > > Kernel mode NEON can be used in task or softirq context, but only in > >

Re: [RFC PATCH 4/5] arm64: fpsimd: run kernel mode NEON with softirqs disabled

2021-01-19 Thread Dave Martin
On Fri, Dec 18, 2020 at 06:01:05PM +0100, Ard Biesheuvel wrote: > Kernel mode NEON can be used in task or softirq context, but only in > a non-nesting manner, i.e., softirq context is only permitted if the > interrupt was not taken at a point where the kernel was using the NEON > in task context. >

Re: [PATCH 0/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-16 Thread Dave Martin
On Sat, Nov 14, 2020 at 03:31:56PM +0800, Li Qiang wrote: > > > On 2020/11/12 19:17, Dave Martin wrote: > > On Thu, Nov 12, 2020 at 03:20:53PM +0800, Li Qiang wrote: > >> > >> > >> On 2020/11/11 0:07, Dave Martin wrote: > >>>>>>

Re: [PATCH 0/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-12 Thread Dave Martin
On Thu, Nov 12, 2020 at 03:20:53PM +0800, Li Qiang wrote: > > > On 2020/11/11 0:07, Dave Martin wrote: > >>>>> add zA.s, pP/m, zA.s, zX.s // zA.s += zX.s > >>>>> > >>>>> msb zX.s,

Re: [PATCH 0/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-10 Thread Dave Martin
On Tue, Nov 10, 2020 at 09:20:46PM +0800, Li Qiang wrote: > > > On 2020/11/10 18:46, Dave Martin wrote: > > On Mon, Nov 09, 2020 at 11:43:35AM +0800, Li Qiang wrote: > >> Hi Dave, > >> > >> I carefully read the ideas you provided and the sample code you g

Re: [PATCH 0/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-10 Thread Dave Martin
On Mon, Nov 09, 2020 at 11:43:35AM +0800, Li Qiang wrote: > Hi Dave, > > I carefully read the ideas you provided and the sample code you gave me. :) > > On 2020/11/6 0:53, Dave Martin wrote: > > On Tue, Nov 03, 2020 at 08:15:05PM +0800, l00374334 wrote: > >> Fro

Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-05 Thread Dave Martin
On Wed, Nov 04, 2020 at 06:49:05PM +0000, Mark Brown wrote: > On Wed, Nov 04, 2020 at 06:13:06PM +0000, Dave Martin wrote: > > On Wed, Nov 04, 2020 at 05:50:33PM +0000, Mark Brown wrote: > > > > I think at a minimum we'd want to handle the vector length explicitly

Re: [PATCH 0/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-05 Thread Dave Martin
On Tue, Nov 03, 2020 at 08:15:05PM +0800, l00374334 wrote: > From: liqiang > > Dear all, > > Thank you for taking the precious time to read this email! > > Let me introduce the implementation ideas of my code here. > > In the process of using the compression library libz, I found that the adle

Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-04 Thread Dave Martin
On Wed, Nov 04, 2020 at 05:50:33PM +0000, Mark Brown wrote: > On Tue, Nov 03, 2020 at 06:00:32PM +0000, Dave Martin wrote: > > On Tue, Nov 03, 2020 at 03:34:27PM +0100, Ard Biesheuvel wrote: > > > > First of all, I don't think it is safe at the moment to use SVE in the

Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-04 Thread Dave Martin
On Wed, Nov 04, 2020 at 05:19:18PM +0800, Li Qiang wrote: > Hi Dave, > > Thank you very much for your reply and suggestions. :) > > On 2020/11/4 2:00, Dave Martin wrote: > > On Tue, Nov 03, 2020 at 03:34:27PM +0100, Ard Biesheuvel wrote: > >> (+ Dave) > >> >

Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

2020-11-03 Thread Dave Martin
On Tue, Nov 03, 2020 at 03:34:27PM +0100, Ard Biesheuvel wrote: > (+ Dave) > > Hello liqiang, > > First of all, I don't think it is safe at the moment to use SVE in the > kernel, as we don't preserve all state IIRC. My memory is a bit hazy, I'm not convinced that it's safe right now. SVE in the

Re: [BUG][PATCH v3] crypto: arm64: Use x16 with indirect branch to bti_c

2020-10-07 Thread Dave Martin
> crypto_skcipher_encrypt+0x50/0x84 > test_skcipher_vec_cfg+0x224/0x5f0 > test_skcipher+0xbc/0x120 > alg_test_skcipher+0xa0/0x1b0 > alg_test+0x3dc/0x47c > cryptomgr_test+0x38/0x60 > > Fixes: 0e89640b640d ("crypto: arm64 - Use modern annotations for assembly

Re: [BUG][PATCH] crypto: arm64: Avoid indirect branch to bti_c

2020-10-06 Thread Dave Martin
On Tue, Oct 06, 2020 at 11:25:11AM +0100, Catalin Marinas wrote: > On Tue, Oct 06, 2020 at 11:01:21AM +0100, Dave P Martin wrote: > > On Tue, Oct 06, 2020 at 09:27:48AM +0100, Will Deacon wrote: > > > On Mon, Oct 05, 2020 at 10:48:54PM -0500, Jeremy Linton wrote: > > > > The AES code uses a 'br x7'

Re: [BUG][PATCH] crypto: arm64: Avoid indirect branch to bti_c

2020-10-06 Thread Dave Martin
On Tue, Oct 06, 2020 at 09:27:48AM +0100, Will Deacon wrote: > On Mon, Oct 05, 2020 at 10:48:54PM -0500, Jeremy Linton wrote: > > The AES code uses a 'br x7' as part of a function called by > > a macro. That branch needs a bti_j as a target. This results > > in a panic as seen below. Instead of try

Re: [BUG][PATCH] arm64: bti: fix BTI to handle local indirect branches

2020-10-06 Thread Dave Martin
On Mon, Oct 05, 2020 at 02:24:47PM -0500, Jeremy Linton wrote: > Hi, > > On 10/5/20 1:54 PM, Ard Biesheuvel wrote: > >On Mon, 5 Oct 2020 at 20:18, Jeremy Linton wrote: > >> > >>The AES code uses a 'br x7' as part of a function called by > >>a macro, that ends up needing a BTI_J as a target. > > >

Re: [PATCH 1/4] crypto/arm64: ghash - reduce performance impact of NEON yield checks

2018-07-25 Thread Dave Martin
On Wed, Jul 25, 2018 at 10:11:42AM +0100, Ard Biesheuvel wrote: > On 25 July 2018 at 11:05, Dave Martin wrote: > > On Tue, Jul 24, 2018 at 06:12:21PM +0100, Ard Biesheuvel wrote: > >> As reported by Vakul, checking the TIF_NEED_RESCHED flag after every > >> iteration of

Re: [PATCH 0/4] crypto/arm64: reduce impact of NEON yield checks

2018-07-25 Thread Dave Martin
On Wed, Jul 25, 2018 at 10:23:00AM +0100, Ard Biesheuvel wrote: > On 25 July 2018 at 11:09, Dave Martin wrote: > > On Tue, Jul 24, 2018 at 06:12:20PM +0100, Ard Biesheuvel wrote: > >> Vakul reports a considerable performance hit when running the accelerated > >>

Re: [PATCH 0/4] crypto/arm64: reduce impact of NEON yield checks

2018-07-25 Thread Dave Martin
On Tue, Jul 24, 2018 at 06:12:20PM +0100, Ard Biesheuvel wrote: > Vakul reports a considerable performance hit when running the accelerated > arm64 crypto routines with CONFIG_PREEMPT=y configured, now that they have > been updated to take the TIF_NEED_RESCHED flag into account. > > The issue appe

Re: [PATCH 1/4] crypto/arm64: ghash - reduce performance impact of NEON yield checks

2018-07-25 Thread Dave Martin
On Tue, Jul 24, 2018 at 06:12:21PM +0100, Ard Biesheuvel wrote: > As reported by Vakul, checking the TIF_NEED_RESCHED flag after every > iteration of the GHASH and AES-GCM core routines is having a considerable > performance impact on cores such as the Cortex-A53 with Crypto Extensions > implemente

Re: [RFC PATCH] crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS

2018-03-06 Thread Dave Martin
On Tue, Mar 06, 2018 at 12:47:45PM +0000, Ard Biesheuvel wrote: > On 6 March 2018 at 12:35, Dave Martin wrote: > > On Mon, Mar 05, 2018 at 11:17:07AM -0800, Eric Biggers wrote: > >> Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS > >> for ARM64.

Re: [RFC PATCH] crypto: arm64/speck - add NEON-accelerated implementation of Speck-XTS

2018-03-06 Thread Dave Martin
On Mon, Mar 05, 2018 at 11:17:07AM -0800, Eric Biggers wrote: > Add a NEON-accelerated implementation of Speck128-XTS and Speck64-XTS > for ARM64. This is ported from the 32-bit version. It may be useful on > devices with 64-bit ARM CPUs that don't have the Cryptography > Extensions, so cannot do

Re: [PATCH v3 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT

2017-12-07 Thread Dave Martin
On Thu, Dec 07, 2017 at 03:47:43PM +0000, Ard Biesheuvel wrote: > On 7 December 2017 at 14:50, Ard Biesheuvel wrote: > > On 7 December 2017 at 14:39, Dave Martin wrote: > >> On Wed, Dec 06, 2017 at 07:43:37PM +0000, Ard Biesheuvel wrote: [...]

Re: [PATCH v3 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT

2017-12-07 Thread Dave Martin
On Thu, Dec 07, 2017 at 02:50:11PM +0000, Ard Biesheuvel wrote: > On 7 December 2017 at 14:39, Dave Martin wrote: > > On Wed, Dec 06, 2017 at 07:43:37PM +0000, Ard Biesheuvel wrote: > >> Add support macros to conditionally yield the NEON (and thus the CPU) > >>

Re: [PATCH v3 10/20] arm64: assembler: add utility macros to push/pop stack frames

2017-12-07 Thread Dave Martin
On Thu, Dec 07, 2017 at 02:21:17PM +0000, Ard Biesheuvel wrote: > On 7 December 2017 at 14:11, Dave Martin wrote: > > On Wed, Dec 06, 2017 at 07:43:36PM +0000, Ard Biesheuvel wrote: > >> We are going to add code to all the NEON crypto routines that will > >> turn them

Re: [PATCH v3 11/20] arm64: assembler: add macros to conditionally yield the NEON under PREEMPT

2017-12-07 Thread Dave Martin
On Wed, Dec 06, 2017 at 07:43:37PM +0000, Ard Biesheuvel wrote: > Add support macros to conditionally yield the NEON (and thus the CPU) > that may be called from the assembler code. > > In some cases, yielding the NEON involves saving and restoring a non > trivial amount of context (especially in

Re: [PATCH v3 10/20] arm64: assembler: add utility macros to push/pop stack frames

2017-12-07 Thread Dave Martin
On Wed, Dec 06, 2017 at 07:43:36PM +0000, Ard Biesheuvel wrote: > We are going to add code to all the NEON crypto routines that will > turn them into non-leaf functions, so we need to manage the stack > frames. To make this less tedious and error prone, add some macros > that take the number of cal

Re: [PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-06 Thread Dave Martin
On Wed, Dec 06, 2017 at 12:25:44PM +0000, Ard Biesheuvel wrote: > On 6 December 2017 at 12:12, Dave P Martin wrote: > > On Wed, Dec 06, 2017 at 11:57:12AM +0000, Ard Biesheuvel wrote: > >> On 6 December 2017 at 11:51, Dave Martin wrote: > >> > On Tue, Dec 05,

Re: [PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-06 Thread Dave Martin
On Tue, Dec 05, 2017 at 06:04:34PM +0000, Ard Biesheuvel wrote: > On 5 December 2017 at 12:45, Ard Biesheuvel wrote: > > > > > >> On 5 Dec 2017, at 12:28, Dave Martin wrote: > >> > >>> On Mon, Dec 04, 2017 at 12:26:37PM +0000, Ard Biesheuvel wrot

Re: [PATCH v2 11/19] arm64: assembler: add macro to conditionally yield the NEON under PREEMPT

2017-12-05 Thread Dave Martin
On Mon, Dec 04, 2017 at 12:26:37PM +0000, Ard Biesheuvel wrote: > Add a support macro to conditionally yield the NEON (and thus the CPU) > that may be called from the assembler code. Given that especially the > instruction based accelerated crypto code may use very tight loops, add > some parametri

Re: [PATCH resend 00/18] crypto: ARM/arm64 roundup for v4.14

2017-08-03 Thread Dave Martin
On Thu, Aug 03, 2017 at 02:26:53PM +0800, Herbert Xu wrote: > On Mon, Jul 24, 2017 at 11:28:02AM +0100, Ard Biesheuvel wrote: > > This is a resend of all the patches I sent out recently that I would > > like to be considered for v4.14. Their main purpose is to prepare the > > arm64 crypto code to d

Re: [PATCH resend 00/18] crypto: ARM/arm64 roundup for v4.14

2017-08-02 Thread Dave Martin
Hi Herbert, This series from Ard is a prerequisite for an arm64 series [1] that I'd like to get merged this cycle (because it is in turn a prerequisite for another major series I want to progress). [1] without this series will break the kernel, whereas this series without [1] won't break the kern

Re: may_use_simd on aarch64, chacha20

2017-05-26 Thread Dave Martin
On Fri, May 26, 2017 at 07:44:46PM +0200, Ard Biesheuvel wrote: > On 26 May 2017 at 15:28, Dave Martin wrote: > > On Sun, May 21, 2017 at 10:55:20PM +0200, Ard Biesheuvel wrote: > >> (+ Dave) [...] > >> > Lastly, APIs like pcrypts and padata execute with botto

Re: may_use_simd on aarch64, chacha20

2017-05-26 Thread Dave Martin
On Sun, May 21, 2017 at 10:55:20PM +0200, Ard Biesheuvel wrote: > (+ Dave) Apologies for the slow reply -- hopefully this is still useful. > > On 21 May 2017, at 19:02, Jason A. Donenfeld wrote: > > > > Hi folks, > > > > I noticed that the ARM implementation [1] of chacha20 makes a check to >