[PATCH] crypto: testmgr - don't DMA map IV from stack in test_skcipher()
Fix the "DMA-API: device driver maps memory from stack" warning generated
when crypto accelerators map the IV.

Signed-off-by: Horia Geantă
---
 crypto/testmgr.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 500a5277cc22..64245aeef634 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -1081,12 +1081,16 @@ static int __test_skcipher(struct crypto_skcipher *tfm, int enc,
 	const char *e, *d;
 	struct tcrypt_result result;
 	void *data;
-	char iv[MAX_IVLEN];
+	char *iv;
 	char *xbuf[XBUFSIZE];
 	char *xoutbuf[XBUFSIZE];
 	int ret = -ENOMEM;
 	unsigned int ivsize = crypto_skcipher_ivsize(tfm);
 
+	iv = kmalloc(MAX_IVLEN, GFP_KERNEL);
+	if (!iv)
+		return ret;
+
 	if (testmgr_alloc_buf(xbuf))
 		goto out_nobuf;
 
@@ -1328,6 +1332,7 @@ static int __test_skcipher(struct crypto_skcipher *tfm, int enc,
 out_nooutbuf:
 	testmgr_free_buf(xbuf);
 out_nobuf:
+	kfree(iv);
 	return ret;
 }
--
2.4.4
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
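A userspace sketch of the pattern the patch switches to: the IV has to live on the heap rather than the stack, because DMA engines cannot safely map stack memory. The `MAX_IVLEN` value and the `alloc_iv` helper name below are illustrative, not the kernel's.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative value; the kernel defines MAX_IVLEN in crypto/testmgr.c. */
#define MAX_IVLEN 32

/* Heap-allocate and zero the IV buffer, mirroring the kmalloc() change
 * in the patch (userspace analogue of kmalloc(MAX_IVLEN, GFP_KERNEL)). */
unsigned char *alloc_iv(void)
{
	unsigned char *iv = malloc(MAX_IVLEN);

	if (iv)
		memset(iv, 0, MAX_IVLEN);
	return iv;
}
```

As in the patch, the matching `free()` (the kernel's `kfree()`) must run on every exit path, which is why the patch adds it at the common `out_nobuf` label.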
[PATCH] virtio-crypto: adjust priority of algorithm
Some hardware accelerators (like Intel AES-NI or the s390 CPACF functions)
have lower priorities than virtio-crypto, yet those drivers are faster than
the same algorithms executed in the host via virtio. So let's lower the
priority of virtio-crypto's algorithm, making it higher than software
implementations but lower than the hardware ones.

Suggested-by: Christian Borntraeger
Signed-off-by: Gonglei
---
 drivers/crypto/virtio/virtio_crypto_algs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/virtio/virtio_crypto_algs.c b/drivers/crypto/virtio/virtio_crypto_algs.c
index 6f40a42..4de4740 100644
--- a/drivers/crypto/virtio/virtio_crypto_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_algs.c
@@ -498,7 +498,7 @@ void virtio_crypto_ablkcipher_finalize_req(
 static struct crypto_alg virtio_crypto_algs[] = { {
 	.cra_name = "cbc(aes)",
 	.cra_driver_name = "virtio_crypto_aes_cbc",
-	.cra_priority = 501,
+	.cra_priority = 150,
 	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
 	.cra_blocksize = AES_BLOCK_SIZE,
 	.cra_ctxsize = sizeof(struct virtio_crypto_ablkcipher_ctx),
--
1.8.3.1
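A toy model of why the number matters: when several implementations register the same cra_name (here "cbc(aes)"), the crypto API binds the one with the highest cra_priority. The entries and numbers below are illustrative stand-ins, not the kernel's actual tables.

```c
#include <stddef.h>

/* Minimal stand-in for struct crypto_alg's name/priority pair. */
struct alg {
	const char *driver;
	int priority;
};

/* Return the driver with the highest priority, as the crypto API does
 * when resolving an algorithm name to a concrete implementation. */
const char *pick_impl(const struct alg *algs, size_t n)
{
	const struct alg *best = NULL;

	for (size_t i = 0; i < n; i++)
		if (!best || algs[i].priority > best->priority)
			best = &algs[i];
	return best ? best->driver : NULL;
}
```

With the patch, virtio-crypto at 150 still beats the ~100-priority software fallbacks but loses to guest hardware drivers in the 200-400 range, which is exactly the ordering Christian asked for.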
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 08:37:18PM -0800, Linus Torvalds wrote:
> On Jan 12, 2017 8:28 PM, "Josh Poimboeuf" wrote:
> >
> > The stack frame was always 16-byte aligned regardless of whether the
> > buf array size was even or odd.
>
> Including with -fomit-frame-pointer?
>
> With frame pointers, stack frames really are naturally 16 bytes, and then
> keeping the frame 16-byte aligned is just a matter of making any extra
> frame allocations or push/pop sequences that you do also be a multiple of
> 16 bytes.
>
> But *without* frame pointers, the "native" frame size is just 8 bytes, and a
> function that doesn't need any other local storage and then calls another
> function (think various trivial wrapper functions that just add an argument
> and then munge the return value) would thus naturally cause the frame to
> become misaligned.
>
> So then the compiler actually needs to start adding useless instructions
> just to keep the stack 16-byte aligned.

Disabling frame pointers didn't seem to help, but I finally got it to
misalign with a different test case.  I think it had been aligning the
array, so instead I made it push a register.

void otherfunc(void);

static inline void bar(int f)
{
	register void *__sp asm(_ASM_SP);

	asm volatile("call otherfunc" : "+r" (__sp) : "b" (f));
}

void foo(void)
{
	bar(5);
}

20f0 <foo>:
    20f0:	55                      push   %rbp
    20f1:	48 89 e5                mov    %rsp,%rbp
    20f4:	53                      push   %rbx
    20f5:	bb 05 00 00 00          mov    $0x5,%ebx
    20fa:	e8 00 00 00 00          callq  20ff
			20fb: R_X86_64_PC32	otherfunc-0x4
    20ff:	5b                      pop    %rbx
    2100:	5d                      pop    %rbp
    2101:	c3                      retq
    2102:	0f 1f 40 00             nopl   0x0(%rax)
    2106:	66 2e 0f 1f 84 00 00	nopw   %cs:0x0(%rax,%rax,1)
    210d:	00 00 00

--
Josh
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 07:23:18PM -0800, Andy Lutomirski wrote: > On Thu, Jan 12, 2017 at 7:11 PM, Josh Poimboeuf wrote: > > On Thu, Jan 12, 2017 at 05:46:55PM -0800, Andy Lutomirski wrote: > >> On Thu, Jan 12, 2017 at 12:15 PM, Josh Poimboeuf > >> wrote: > >> > On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote: > >> >> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds > >> >> wrote: > >> >> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf > >> >> > wrote: > >> >> >> > >> >> >> Just to clarify, I think you're asking if, for versions of gcc which > >> >> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C > >> >> >> functions to ensure their stacks are 16-byte aligned. > >> >> >> > >> >> >> It's certainly possible, but I don't see how that solves the problem. > >> >> >> The stack will still be misaligned by entry code. Or am I missing > >> >> >> something? > >> >> > > >> >> > I think the argument is that we *could* try to align things, if we > >> >> > just had some tool that actually then verified that we aren't missing > >> >> > anything. > >> >> > > >> >> > I'm not entirely happy with checking the generated code, though, > >> >> > because as Ingo says, you have a 50:50 chance of just getting it right > >> >> > by mistake. So I'd much rather have some static tool that checks > >> >> > things at a code level (ie coccinelle or sparse). > >> >> > >> >> What I meant was checking the entry code to see if it aligns stack > >> >> frames, and good luck getting sparse to do that. Hmm, getting 16-byte > >> >> alignment for real may actually be entirely a lost cause. After all, > >> >> I think we have some inline functions that do asm volatile ("call > >> >> ..."), and I don't see any credible way of forcing alignment short of > >> >> generating an entirely new stack frame and aligning that. > >> > > >> > Actually we already found all such cases and fixed them by forcing a new > >> > stack frame, thanks to objtool. 
For example, see 55a76b59b5fe.
> >>
> >> What I mean is: what guarantees that the stack is properly aligned for
> >> the subroutine call?  gcc promises to set up a stack frame, but does
> >> it promise that rsp will be properly aligned to call a C function?
> >
> > Yes, I did an experiment and you're right.  I had naively assumed that
> > all stack frames would be aligned.
>
> Just to check: did you do your experiment with -mpreferred-stack-boundary=4?

Yes, but it's too late for me to be doing hard stuff and I think my first
experiment was bogus.  I didn't use all the other kernel-specific gcc
options.

I tried again with all the kernel gcc options, except with
-mpreferred-stack-boundary=4 instead of 3, and actually came up with the
opposite conclusion.  I used the following code:

void otherfunc(void);

static inline void bar(long *f)
{
	asm volatile("call otherfunc" : : "m" (f) : );
}

void foo(void)
{
	long buf[3] = {0, 0, 0};

	bar(buf);
}

The stack frame was always 16-byte aligned regardless of whether the buf
array size was even or odd.  So my half-asleep brain is telling me that my
original assumption was right.

--
Josh
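A userspace probe in the spirit of the experiment above: check whether the compiler actually honors a 16-byte alignment request for a stack object. Under the default SysV x86-64 ABI (16-byte stack boundary) this must hold; the open question in the thread is what happens once the kernel builds with -mpreferred-stack-boundary=3 and rsp is only 8-byte aligned. The function name is made up for illustration.

```c
#include <stdint.h>

/* Return 1 if a local requesting 16-byte alignment really lands on a
 * 16-byte boundary. The C11 _Alignas keyword forces the compiler to
 * realign the object if the incoming stack pointer is misaligned. */
int local_is_16byte_aligned(void)
{
	_Alignas(16) unsigned char buf[16];

	return ((uintptr_t)buf % 16) == 0;
}
```

Compiling a probe like this with the kernel's exact CFLAGS (as Josh did) is what distinguishes "gcc aligned it by luck" from "gcc emitted realignment code".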
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 7:11 PM, Josh Poimboeuf wrote: > On Thu, Jan 12, 2017 at 05:46:55PM -0800, Andy Lutomirski wrote: >> On Thu, Jan 12, 2017 at 12:15 PM, Josh Poimboeuf wrote: >> > On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote: >> >> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds >> >> wrote: >> >> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf >> >> > wrote: >> >> >> >> >> >> Just to clarify, I think you're asking if, for versions of gcc which >> >> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C >> >> >> functions to ensure their stacks are 16-byte aligned. >> >> >> >> >> >> It's certainly possible, but I don't see how that solves the problem. >> >> >> The stack will still be misaligned by entry code. Or am I missing >> >> >> something? >> >> > >> >> > I think the argument is that we *could* try to align things, if we >> >> > just had some tool that actually then verified that we aren't missing >> >> > anything. >> >> > >> >> > I'm not entirely happy with checking the generated code, though, >> >> > because as Ingo says, you have a 50:50 chance of just getting it right >> >> > by mistake. So I'd much rather have some static tool that checks >> >> > things at a code level (ie coccinelle or sparse). >> >> >> >> What I meant was checking the entry code to see if it aligns stack >> >> frames, and good luck getting sparse to do that. Hmm, getting 16-byte >> >> alignment for real may actually be entirely a lost cause. After all, >> >> I think we have some inline functions that do asm volatile ("call >> >> ..."), and I don't see any credible way of forcing alignment short of >> >> generating an entirely new stack frame and aligning that. >> > >> > Actually we already found all such cases and fixed them by forcing a new >> > stack frame, thanks to objtool. For example, see 55a76b59b5fe. >> >> What I mean is: what guarantees that the stack is properly aligned for >> the subroutine call? 
gcc promises to set up a stack frame, but does
>> it promise that rsp will be properly aligned to call a C function?
>
> Yes, I did an experiment and you're right.  I had naively assumed that
> all stack frames would be aligned.

Just to check: did you do your experiment with -mpreferred-stack-boundary=4?

--Andy
Re: [RFC PATCH 5/6] crypto: aesni-intel - Add bulk request support
On Thu, Jan 12, 2017 at 01:59:57PM +0100, Ondrej Mosnacek wrote:
> This patch implements bulk request handling in the AES-NI crypto drivers.
> The major advantage of this is that with bulk requests, the kernel_fpu_*
> functions (which are usually quite slow) are now called only once for the
> whole request.

Hi Ondrej,

To what extent does the performance benefit of this patchset result from
just the reduced number of calls to kernel_fpu_begin() and
kernel_fpu_end()?

If it's most of the benefit, would it make any sense to optimize
kernel_fpu_begin() and kernel_fpu_end() instead?

And if there are other examples besides kernel_fpu_begin/kernel_fpu_end
where the bulk API would provide a significant performance boost, can you
mention them?

Interestingly, the arm64 equivalent to kernel_fpu_begin()
(kernel_neon_begin_partial() in arch/arm64/kernel/fpsimd.c) appears to
have an optimization where the SIMD registers aren't saved if they were
already saved.  I wonder why something similar isn't done on x86.

Eric
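A toy model of the amortization being discussed: count how many times the expensive begin/end pair runs when each request gets its own FPU section versus when a bulk request shares one. The fake_fpu_* functions are stand-ins for kernel_fpu_begin()/kernel_fpu_end(); the names and structure are illustrative only.

```c
#include <stddef.h>

/* Counts how many times the (expensive) FPU section was opened. */
static int fpu_sections;

static void fake_fpu_begin(void) { fpu_sections++; }
static void fake_fpu_end(void)   { }

/* One begin/end pair per request: nreqs expensive sections. */
int encrypt_per_request(size_t nreqs)
{
	fpu_sections = 0;
	for (size_t i = 0; i < nreqs; i++) {
		fake_fpu_begin();
		/* ... SIMD work for one request ... */
		fake_fpu_end();
	}
	return fpu_sections;
}

/* Bulk request: one begin/end pair amortized over all requests. */
int encrypt_bulk(size_t nreqs)
{
	fpu_sections = 0;
	fake_fpu_begin();
	for (size_t i = 0; i < nreqs; i++) {
		/* ... SIMD work for one request ... */
	}
	fake_fpu_end();
	return fpu_sections;
}
```

Eric's question is essentially whether shrinking the constant cost of each section (optimizing begin/end themselves, as arm64's lazy save does) would capture most of the win without a new bulk API.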
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 05:46:55PM -0800, Andy Lutomirski wrote: > On Thu, Jan 12, 2017 at 12:15 PM, Josh Poimboeuf wrote: > > On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote: > >> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds > >> wrote: > >> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf > >> > wrote: > >> >> > >> >> Just to clarify, I think you're asking if, for versions of gcc which > >> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C > >> >> functions to ensure their stacks are 16-byte aligned. > >> >> > >> >> It's certainly possible, but I don't see how that solves the problem. > >> >> The stack will still be misaligned by entry code. Or am I missing > >> >> something? > >> > > >> > I think the argument is that we *could* try to align things, if we > >> > just had some tool that actually then verified that we aren't missing > >> > anything. > >> > > >> > I'm not entirely happy with checking the generated code, though, > >> > because as Ingo says, you have a 50:50 chance of just getting it right > >> > by mistake. So I'd much rather have some static tool that checks > >> > things at a code level (ie coccinelle or sparse). > >> > >> What I meant was checking the entry code to see if it aligns stack > >> frames, and good luck getting sparse to do that. Hmm, getting 16-byte > >> alignment for real may actually be entirely a lost cause. After all, > >> I think we have some inline functions that do asm volatile ("call > >> ..."), and I don't see any credible way of forcing alignment short of > >> generating an entirely new stack frame and aligning that. > > > > Actually we already found all such cases and fixed them by forcing a new > > stack frame, thanks to objtool. For example, see 55a76b59b5fe. > > What I mean is: what guarantees that the stack is properly aligned for > the subroutine call? gcc promises to set up a stack frame, but does > it promise that rsp will be properly aligned to call a C function? 
Yes, I did an experiment and you're right.  I had naively assumed that all
stack frames would be aligned.

--
Josh
RE: [PATCH v8 1/1] crypto: add virtio-crypto driver
> > On Thu, Jan 12, 2017 at 03:10:25PM +0100, Christian Borntraeger wrote: > > On 01/10/2017 01:56 PM, Christian Borntraeger wrote: > > > On 01/10/2017 01:36 PM, Gonglei (Arei) wrote: > > >> Hi, > > >> > > >>> > > >>> On 12/15/2016 03:03 AM, Gonglei wrote: > > >>> [...] > > + > > +static struct crypto_alg virtio_crypto_algs[] = { { > > + .cra_name = "cbc(aes)", > > + .cra_driver_name = "virtio_crypto_aes_cbc", > > + .cra_priority = 501, > > >>> > > >>> > > >>> This is still higher than the hardware-accelerators (like intel aesni > > >>> or the > > >>> s390 cpacf functions or the arm hw). aesni and s390/cpacf are supported > by the > > >>> hardware virtualization and available to the guests. I do not see a way > how > > >>> virtio > > >>> crypto can be faster than that (in the end it might be cpacf/aesni + > overhead) > > >>> instead it will very likely be slower. > > >>> So we should use a number that is higher than software implementations > but > > >>> lower than the hw ones. > > >>> > > >>> Just grepping around, the software ones seem be be around 100 and the > > >>> hardware > > >>> ones around 200-400. So why was 150 not enough? > > >>> > > >> I didn't find a documentation about how we use the priority, and I > > >> assumed > > >> people use virtio-crypto will configure hardware accelerators in the > > >> host. So I choosed the number which bigger than aesni's priority. > > > > > > Yes, but the aesni driver will only bind if there is HW support in the > > > guest. > > > And if aesni is available in the guest (or the s390 aes function from > > > cpacf) > > > it will always be faster than the same in the host via virtio.So your > > > priority > > > should be smaller. > > > > > > any opinion on this? > > Going forward, we might add an emulated aesni device and that might > become slower than virtio. OTOH if or when this happens, we can solve it > by adding a priority or a feature flag to virtio to raise its priority. 
> So I think I agree with Christian here, let's lower the priority.
> Gonglei, could you send a patch like this?

OK, will do.

Thanks,
-Gonglei
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 12:15 PM, Josh Poimboeuf wrote: > On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote: >> On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds >> wrote: >> > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf >> > wrote: >> >> >> >> Just to clarify, I think you're asking if, for versions of gcc which >> >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C >> >> functions to ensure their stacks are 16-byte aligned. >> >> >> >> It's certainly possible, but I don't see how that solves the problem. >> >> The stack will still be misaligned by entry code. Or am I missing >> >> something? >> > >> > I think the argument is that we *could* try to align things, if we >> > just had some tool that actually then verified that we aren't missing >> > anything. >> > >> > I'm not entirely happy with checking the generated code, though, >> > because as Ingo says, you have a 50:50 chance of just getting it right >> > by mistake. So I'd much rather have some static tool that checks >> > things at a code level (ie coccinelle or sparse). >> >> What I meant was checking the entry code to see if it aligns stack >> frames, and good luck getting sparse to do that. Hmm, getting 16-byte >> alignment for real may actually be entirely a lost cause. After all, >> I think we have some inline functions that do asm volatile ("call >> ..."), and I don't see any credible way of forcing alignment short of >> generating an entirely new stack frame and aligning that. > > Actually we already found all such cases and fixed them by forcing a new > stack frame, thanks to objtool. For example, see 55a76b59b5fe. What I mean is: what guarantees that the stack is properly aligned for the subroutine call? gcc promises to set up a stack frame, but does it promise that rsp will be properly aligned to call a C function? 
[cryptodev:master 43/44] arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr tt,=crypto_ft_tab'
tree: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
head: 1abee99eafab67fb1c98f9ecfc43cd5735384a86
commit: 81edb42629758bacdf813dd5e4542ae26e3ad73a [43/44] crypto: arm/aes - replace scalar AES cipher
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        git checkout 81edb42629758bacdf813dd5e4542ae26e3ad73a
        # save the attached .config to linux build tree
        make.cross ARCH=arm

All errors (new ones prefixed by >>):

   arch/arm/crypto/aes-cipher-core.S: Assembler messages:
   arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr tt,=crypto_ft_tab'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r8,[tt,r8,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsr#(8*2)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsr#(8*3)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r5,[tt,r5,lsr#(8*2)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsr#(8*3)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r4,[tt,r4,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r5,[tt,r5,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsr#(8*2)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r7,[tt,r7,lsr#(8*3)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r7,[tt,r7,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsr#(8*1)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsr#(8*2)-2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsr#(8*2)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*3)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsr#(8*3)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r8,[tt,r8,lsl#2]'
>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsr#(8*1)-2]'
   arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsl#2]'

vim +174 arch/arm/crypto/aes-cipher-core.S

    15	.align		5
    16
    17	rk		.req	r0
    18	rounds		.req	r1
    19	in		.req	r2
    20	out		.req	r3
  > 21	tt		.req	ip
    22
    23	t0		.req	lr
    24	t1
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 12:55 PM, Josh Poimboeuf wrote:
>
> - Who's going to run sparse all the time to catch unauthorized users of
>   __aligned__(16)?

Well, considering that we apparently only have a small handful of existing
users without anybody having ever run any tool at all, I don't think this
is necessarily a huge problem.

One of the build servers could easily add the "make C=2" case to a build
test, and just grep the error reports for the 'excessive alignment'
string.  The zero-day build bot already does much fancier things.

So I don't think it would necessarily be all that hard to get a clean
build, and just say "if you need aligned stack space, you have to do it
yourself by hand".

That said, if we now always enable frame pointers on x86 (and it has
gotten more and more difficult to avoid it), then the 16-byte alignment
would be fairly natural.  The 8-byte alignment mainly makes sense when the
basic call sequence just adds 8 bytes, and you have functions without
frames (that still call other functions).

Linus
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 02:15:11PM -0600, Josh Poimboeuf wrote: > On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote: > > On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds > > wrote: > > > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf > > > wrote: > > >> > > >> Just to clarify, I think you're asking if, for versions of gcc which > > >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C > > >> functions to ensure their stacks are 16-byte aligned. > > >> > > >> It's certainly possible, but I don't see how that solves the problem. > > >> The stack will still be misaligned by entry code. Or am I missing > > >> something? > > > > > > I think the argument is that we *could* try to align things, if we > > > just had some tool that actually then verified that we aren't missing > > > anything. > > > > > > I'm not entirely happy with checking the generated code, though, > > > because as Ingo says, you have a 50:50 chance of just getting it right > > > by mistake. So I'd much rather have some static tool that checks > > > things at a code level (ie coccinelle or sparse). > > > > What I meant was checking the entry code to see if it aligns stack > > frames, and good luck getting sparse to do that. Hmm, getting 16-byte > > alignment for real may actually be entirely a lost cause. After all, > > I think we have some inline functions that do asm volatile ("call > > ..."), and I don't see any credible way of forcing alignment short of > > generating an entirely new stack frame and aligning that. > > Actually we already found all such cases and fixed them by forcing a new > stack frame, thanks to objtool. For example, see 55a76b59b5fe. > > > Ick. This > > whole situation stinks, and I wish that the gcc developers had been > > less daft here in the first place or that we'd noticed and gotten it > > fixed much longer ago. 
> > Can we come up with a macro like STACK_ALIGN_16 that turns into
> > __aligned__(32) on bad gcc versions and combine that with your sparse
> > patch?

This could work.  Only concerns I'd have are:

- Are there (or will there be in the future) any asm functions which
  assume a 16-byte aligned stack?  (Seems unlikely.  Stack alignment is
  common in the crypto code but they do the alignment manually.)

- Who's going to run sparse all the time to catch unauthorized users of
  __aligned__(16)?

--
Josh
Re: [cryptodev:master 43/44] arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
Hi Arnd,

On 12 January 2017 at 19:04, kbuild test robot wrote:
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
> head: 1abee99eafab67fb1c98f9ecfc43cd5735384a86
> commit: 81edb42629758bacdf813dd5e4542ae26e3ad73a [43/44] crypto: arm/aes - replace scalar AES cipher
> config: arm-multi_v7_defconfig (attached as .config)
> compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705
> reproduce:
>         wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         git checkout 81edb42629758bacdf813dd5e4542ae26e3ad73a
>         # save the attached .config to linux build tree
>         make.cross ARCH=arm
>
> All errors (new ones prefixed by >>):
>
>    arch/arm/crypto/aes-cipher-core.S: Assembler messages:
>>> arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode

Did you ever see this error? This is very odd: .req simply declares an
alias for a register name, and this works fine locally.

>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movw tt,#:lower16:crypto_ft_tab'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movt tt,#:upper16:crypto_ft_tab'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r8,[tt,r8,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r9,[tt,r9,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r10,[tt,r10,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r11,[tt,r11,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t1,[tt,t1,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t2,[tt,t2,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r5,[tt,r5,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr t0,[tt,t0,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r6,[tt,r6,lsl#2]'
>>> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr r4,[tt,r4,lsl#2]'
>
> vim +21 arch/arm/crypto/aes-cipher-core.S
>
>     15	.align		5
>     16
>     17	rk		.req	r0
>     18	rounds		.req	r1
>     19	in		.req	r2
>     20	out		.req	r3
>   > 21	tt		.req	ip
>     22
>     23	t0		.req	lr
>     24	t1		.req	r2
>     25	t2		.req	r3
>     26
>     27	.macro		__select, out, in, idx
>     28	.if		__LINUX_ARM_ARCH__ < 7
>     29	and		\out, \in, #0xff << (8 * \idx)
>     30	.else
>     31	ubfx		\out, \in, #(8 * \idx), #8
>     32	.endif
>     33	.endm
>     34
>     35	.macro		__load, out, in, idx
>     36	.if		__LINUX_ARM_ARCH__ < 7 && \idx > 0
>     37	ldr		\out, [tt, \in, lsr #(8 * \idx) - 2]
>     38	.else
>     39	ldr		\out, [tt, \in, lsl #2]
>     40	.endif
>     41	.endm
>     42
>     43	.macro		__hround, out0, out1, in0, in1, in2, in3, t3, t4, enc
>     44	__select	\out0, \in0, 0
>     45	__select	t0, \in1, 1
>     46	__load		\out0, \out0, 0
>     47	__load		t0, t0, 1
>     48
>     49	.if		\enc
>     50	__select	\out1, \in1, 0
>     51	__select	t1, \in2, 1
>     52	.else
>     53	__select	\out1, \in3, 0
>     54	__select	t1, \in0, 1
>     55	.endif
>     56	__load		\out1, \out1, 0
>     57	__select	t2, \in2, 2
>     58	__load		t1, t1, 1
>     59	__load		t2, t2, 2
>     60
>     61
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 12:08:07PM -0800, Andy Lutomirski wrote: > On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds > wrote: > > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf wrote: > >> > >> Just to clarify, I think you're asking if, for versions of gcc which > >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C > >> functions to ensure their stacks are 16-byte aligned. > >> > >> It's certainly possible, but I don't see how that solves the problem. > >> The stack will still be misaligned by entry code. Or am I missing > >> something? > > > > I think the argument is that we *could* try to align things, if we > > just had some tool that actually then verified that we aren't missing > > anything. > > > > I'm not entirely happy with checking the generated code, though, > > because as Ingo says, you have a 50:50 chance of just getting it right > > by mistake. So I'd much rather have some static tool that checks > > things at a code level (ie coccinelle or sparse). > > What I meant was checking the entry code to see if it aligns stack > frames, and good luck getting sparse to do that. Hmm, getting 16-byte > alignment for real may actually be entirely a lost cause. After all, > I think we have some inline functions that do asm volatile ("call > ..."), and I don't see any credible way of forcing alignment short of > generating an entirely new stack frame and aligning that. Actually we already found all such cases and fixed them by forcing a new stack frame, thanks to objtool. For example, see 55a76b59b5fe. > Ick. This > whole situation stinks, and I wish that the gcc developers had been > less daft here in the first place or that we'd noticed and gotten it > fixed much longer ago. > > Can we come up with a macro like STACK_ALIGN_16 that turns into > __aligned__(32) on bad gcc versions and combine that with your sparse > patch? 
-- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 11:51 AM, Linus Torvalds wrote: > On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf wrote: >> >> Just to clarify, I think you're asking if, for versions of gcc which >> don't support -mpreferred-stack-boundary=3, objtool can analyze all C >> functions to ensure their stacks are 16-byte aligned. >> >> It's certainly possible, but I don't see how that solves the problem. >> The stack will still be misaligned by entry code. Or am I missing >> something? > > I think the argument is that we *could* try to align things, if we > just had some tool that actually then verified that we aren't missing > anything. > > I'm not entirely happy with checking the generated code, though, > because as Ingo says, you have a 50:50 chance of just getting it right > by mistake. So I'd much rather have some static tool that checks > things at a code level (ie coccinelle or sparse). What I meant was checking the entry code to see if it aligns stack frames, and good luck getting sparse to do that. Hmm, getting 16-byte alignment for real may actually be entirely a lost cause. After all, I think we have some inline functions that do asm volatile ("call ..."), and I don't see any credible way of forcing alignment short of generating an entirely new stack frame and aligning that. Ick. This whole situation stinks, and I wish that the gcc developers had been less daft here in the first place or that we'd noticed and gotten it fixed much longer ago. Can we come up with a macro like STACK_ALIGN_16 that turns into __aligned__(32) on bad gcc versions and combine that with your sparse patch? -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 6:02 AM, Josh Poimboeuf wrote: > > Just to clarify, I think you're asking if, for versions of gcc which > don't support -mpreferred-stack-boundary=3, objtool can analyze all C > functions to ensure their stacks are 16-byte aligned. > > It's certainly possible, but I don't see how that solves the problem. > The stack will still be misaligned by entry code. Or am I missing > something? I think the argument is that we *could* try to align things, if we just had some tool that actually then verified that we aren't missing anything. I'm not entirely happy with checking the generated code, though, because as Ingo says, you have a 50:50 chance of just getting it right by mistake. So I'd much rather have some static tool that checks things at a code level (ie coccinelle or sparse). Almost totally untested "sparse" patch appended. The problem with sparse, obviously, is that few enough people run it, and it gives a lot of other warnings. But maybe Herbert can test whether this would actually have caught his situation, doing something like an allmodconfig build with "C=2" to force a sparse run on everything, and redirecting the warnings to stderr. But this patch does seem to give a warning for the patch that Herbert had, and that caused problems. And in fact it seems to find a few other possible problems (most, but not all, in crypto). 
This run was with the broken chacha20 patch applied, to verify that I get a warning for that case: arch/x86/crypto/chacha20_glue.c:70:13: warning: symbol 'state' has excessive alignment (16) arch/x86/crypto/aesni-intel_glue.c:724:12: warning: symbol 'iv' has excessive alignment (16) arch/x86/crypto/aesni-intel_glue.c:803:12: warning: symbol 'iv' has excessive alignment (16) crypto/shash.c:82:12: warning: symbol 'ubuf' has excessive alignment (16) crypto/shash.c:118:12: warning: symbol 'ubuf' has excessive alignment (16) drivers/char/hw_random/via-rng.c:89:14: warning: symbol 'buf' has excessive alignment (16) net/bridge/netfilter/ebtables.c:1809:31: warning: symbol 'tinfo' has excessive alignment (64) drivers/crypto/padlock-sha.c:85:14: warning: symbol 'buf' has excessive alignment (16) drivers/crypto/padlock-sha.c:147:14: warning: symbol 'buf' has excessive alignment (16) drivers/crypto/padlock-sha.c:304:12: warning: symbol 'buf' has excessive alignment (16) drivers/crypto/padlock-sha.c:388:12: warning: symbol 'buf' has excessive alignment (16) net/openvswitch/actions.c:797:33: warning: symbol 'ovs_rt' has excessive alignment (64) drivers/net/ethernet/neterion/vxge/vxge-config.c:1006:38: warning: symbol 'vpath' has excessive alignment (64) although I think at least some of these happen to be ok. There are a few places that clearly don't care about exact alignment, and use "__attribute__((aligned))" without any specific alignment value. It's just sparse that thinks that implies 16-byte alignment (it doesn't, really - it's unspecified, and is telling gcc to use "maximum useful alignment", so who knows _what_ gcc will assume). But some of them may well be real issues - if the alignment is about correctness rather than anything else. Anyway, the advantage of this kind of source-level check is that it should really catch things regardless of "luck" wrt alignment. 
Linus flow.c | 14 ++ 1 file changed, 14 insertions(+) diff --git a/flow.c b/flow.c index 7db9548..c876869 100644 --- a/flow.c +++ b/flow.c @@ -601,6 +601,20 @@ static void simplify_one_symbol(struct entrypoint *ep, struct symbol *sym) unsigned long mod; int all, stores, complex; + /* +* Warn about excessive local variable alignment. +* +* This needs to be linked up with some flag to enable +* it, and specify the alignment. The 'max_int_alignment' +* just happens to be what we want for the kernel for x86-64. +*/ + mod = sym->ctype.modifiers; + if (!(mod & (MOD_NONLOCAL | MOD_STATIC))) { + unsigned int alignment = sym->ctype.alignment; + if (alignment > max_int_alignment) + warning(sym->pos, "symbol '%s' has excessive alignment (%u)", show_ident(sym->ident), alignment); + } + /* Never used as a symbol? */ pseudo = sym->pseudo; if (!pseudo)
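The class of local variable that sparse check flags can be reproduced in a small userspace program. This is an illustrative sketch, not kernel code: `alignment_slack` is an invented name, and in userspace the psABI already guarantees a 16-byte-aligned stack, so the compiler can honor the request. Under the kernel's `-mpreferred-stack-boundary=3` only 8-byte alignment is guaranteed, which is why such variables are suspect there.

```c
#include <assert.h>
#include <stdint.h>

/*
 * The pattern the sparse patch above would warn about: a local variable
 * requesting 16-byte alignment. In userspace the compiler can realign
 * the stack, so the request is honored and the slack is 0; inside the
 * kernel, with an 8-byte-aligned stack, gcc silently assumes alignment
 * it does not actually have.
 */
static int alignment_slack(void)
{
	char state[64] __attribute__((aligned(16)));

	/* Address modulo 16: 0 if the alignment request was honored. */
	return (int)((uintptr_t)state % 16);
}
```

In a kernel context the interesting case is exactly the one this userspace test cannot show: the same code with a misaligned stack, where the variable ends up unaligned despite the attribute.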
[cryptodev:master 43/44] arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not support `tt .req ip' in ARM mode
tree: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master head: 1abee99eafab67fb1c98f9ecfc43cd5735384a86 commit: 81edb42629758bacdf813dd5e4542ae26e3ad73a [43/44] crypto: arm/aes - replace scalar AES cipher config: arm-multi_v7_defconfig (attached as .config) compiler: arm-linux-gnueabi-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout 81edb42629758bacdf813dd5e4542ae26e3ad73a # save the attached .config to linux build tree make.cross ARCH=arm All errors (new ones prefixed by >>): arch/arm/crypto/aes-cipher-core.S: Assembler messages: >> arch/arm/crypto/aes-cipher-core.S:21: Error: selected processor does not >> support `tt .req ip' in ARM mode >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movw >> tt,#:lower16:crypto_ft_tab' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `movt >> tt,#:upper16:crypto_ft_tab' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r8,[tt,r8,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> t0,[tt,t0,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r9,[tt,r9,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> t1,[tt,t1,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> t2,[tt,t2,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r10,[tt,r10,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> t0,[tt,t0,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r11,[tt,r11,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r10,[tt,r10,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: 
Error: ARM register expected -- `ldr >> t0,[tt,t0,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r11,[tt,r11,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> t1,[tt,t1,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> t2,[tt,t2,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r5,[tt,r5,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> t0,[tt,t0,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r6,[tt,r6,lsl#2]' >> arch/arm/crypto/aes-cipher-core.S:174: Error: ARM register expected -- `ldr >> r4,[tt,r4,lsl#2]' vim +21 arch/arm/crypto/aes-cipher-core.S 15 .align 5 16 17 rk .reqr0 18 rounds .reqr1 19 in .reqr2 20 out .reqr3 > 21 tt .reqip 22 23 t0 .reqlr 24 t1 .reqr2 25 t2 .reqr3 26 27 .macro __select, out, in, idx 28 .if __LINUX_ARM_ARCH__ < 7 29 and \out, \in, #0xff << (8 * \idx) 30 .else 31 ubfx\out, \in, #(8 * \idx), #8 32 .endif 33 .endm 34 35 .macro __load, out, in, idx 36 .if __LINUX_ARM_ARCH__ < 7 && \idx > 0 37 ldr \out, [tt, \in, lsr #(8 * \idx) - 2] 38 .else 39 ldr \out, [tt, \in, lsl #2] 40 .endif 41 .endm 42 43 .macro __hround, out0, out1, in0, in1, in2, in3, t3, t4, enc 44 __select\out0, \in0, 0 45 __selectt0, \in1, 1 46 __load \out0, \out0, 0 47 __load t0, t0, 1 48 49 .if \enc 50 __select\out1, \in1, 0 51 __selectt1, \in2, 1 52 .else 53 __select\out1, \in3, 0 54 __selectt1, \in0, 1 55 .endif 56 __load \out1, \out1, 0 57 __selectt2, \in2, 2 58 __load t1, t1, 1 59 __load t2, t2, 2 60 61 eor \out0, \out0, t0, ror #24 62 63 __selectt0, \in3, 3 64 .if \enc 65 __select\t3, \in3, 2 66 __select\t4, \in0, 3 67 .else 68 __select\t3, \in1, 2 69 __select\t4, \in2, 3 70
[PATCH 0/4] n2rng: add support for m5/m7 rng register layout
Commit c1e9b3b0eea1 ("hwrng: n2 - Attach on T5/M5, T7/M7 SPARC CPUs") added config strings to enable the random number generator in the sparc m5 and m7 platforms. This worked fine for client LDoms, but not for the primary LDom, or when running on bare metal, because the actual rng hardware layout changed and self-test would now fail, continually spewing error messages on the console. This patch series adds correct support for the new rng register layout, and adds a limiter to the spewing of error messages. Orabug: 25127795 Shannon Nelson (4): n2rng: limit error spewage when self-test fails n2rng: add device data descriptions n2rng: support new hardware register layout n2rng: update version info drivers/char/hw_random/n2-drv.c | 204 +-- drivers/char/hw_random/n2rng.h | 51 -- 2 files changed, 196 insertions(+), 59 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 4/4] n2rng: update version info
Signed-off-by: Shannon Nelson --- drivers/char/hw_random/n2-drv.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c index f0bd5ee..31cbdbb 100644 --- a/drivers/char/hw_random/n2-drv.c +++ b/drivers/char/hw_random/n2-drv.c @@ -21,11 +21,11 @@ #define DRV_MODULE_NAME"n2rng" #define PFX DRV_MODULE_NAME": " -#define DRV_MODULE_VERSION "0.2" -#define DRV_MODULE_RELDATE "July 27, 2011" +#define DRV_MODULE_VERSION "0.3" +#define DRV_MODULE_RELDATE "Jan 7, 2017" static char version[] = - DRV_MODULE_NAME ".c:v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n"; + DRV_MODULE_NAME " v" DRV_MODULE_VERSION " (" DRV_MODULE_RELDATE ")\n"; MODULE_AUTHOR("David S. Miller (da...@davemloft.net)"); MODULE_DESCRIPTION("Niagara2 RNG driver"); @@ -765,7 +765,7 @@ static int n2rng_probe(struct platform_device *op) "multi-unit-capable" : "single-unit"), np->num_units); - np->hwrng.name = "n2rng"; + np->hwrng.name = DRV_MODULE_NAME; np->hwrng.data_read = n2rng_data_read; np->hwrng.priv = (unsigned long) np; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/4] n2rng: limit error spewage when self-test fails
If the self-test fails, it probably won't actually suddenly start working. Currently, this causes an endless spew of error messages on the console and in the logs, so this patch adds a limiter to the test. Reported-by: Sowmini Varadhan Signed-off-by: Shannon Nelson --- drivers/char/hw_random/n2-drv.c |5 - 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c index 3b06c1d..102560f 100644 --- a/drivers/char/hw_random/n2-drv.c +++ b/drivers/char/hw_random/n2-drv.c @@ -589,6 +589,7 @@ static void n2rng_work(struct work_struct *work) { struct n2rng *np = container_of(work, struct n2rng, work.work); int err = 0; + static int retries = 4; if (!(np->flags & N2RNG_FLAG_CONTROL)) { err = n2rng_guest_check(np); @@ -606,7 +607,9 @@ static void n2rng_work(struct work_struct *work) dev_info(&np->op->dev, "RNG ready\n"); } - if (err && !(np->flags & N2RNG_FLAG_SHUTDOWN)) + if (--retries == 0) + dev_err(&np->op->dev, "Self-test retries failed, RNG not ready\n"); + else if (err && !(np->flags & N2RNG_FLAG_SHUTDOWN)) schedule_delayed_work(&np->work, HZ * 2); } -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
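The bounded-retry idea in the patch above can be sketched in userspace C. All names here (`check_ok`, `run_with_retries`, `MAX_RETRIES`) are invented for illustration; `check_ok()` stands in for the hardware self-test, and the delayed-work rescheduling is collapsed into a plain loop.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch: keep retrying a failing readiness check, but give up after a
 * fixed number of attempts instead of rescheduling (and logging an
 * error) forever.
 */
#define MAX_RETRIES 4

static int attempts_used;

static bool check_ok(void)
{
	return false;	/* simulate a self-test that never passes */
}

static bool run_with_retries(void)
{
	int retries = MAX_RETRIES;

	attempts_used = 0;
	while (retries-- > 0) {
		attempts_used++;
		if (check_ok())
			return true;	/* device ready */
	}
	return false;	/* give up: avoids the endless error spew */
}
```

One design note on the actual patch: because `retries` there is `static` and only ever decremented, the counter is shared across all invocations of the work function for the device's lifetime, which is fine for a one-shot probe-time self-test.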
[PATCH 2/4] n2rng: add device data descriptions
Since we're going to need to keep track of more than just one attribute of the hardware, we'll change the use of the data field from the match struct from a single flag to a struct pointer. This patch adds the struct template and initial descriptions. Signed-off-by: Shannon Nelson --- drivers/char/hw_random/n2-drv.c | 47 -- drivers/char/hw_random/n2rng.h | 15 2 files changed, 54 insertions(+), 8 deletions(-) diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c index 102560f..74c26c7 100644 --- a/drivers/char/hw_random/n2-drv.c +++ b/drivers/char/hw_random/n2-drv.c @@ -625,24 +625,23 @@ static void n2rng_driver_version(void) static int n2rng_probe(struct platform_device *op) { const struct of_device_id *match; - int multi_capable; int err = -ENOMEM; struct n2rng *np; match = of_match_device(n2rng_match, &op->dev); if (!match) return -EINVAL; - multi_capable = (match->data != NULL); n2rng_driver_version(); np = devm_kzalloc(&op->dev, sizeof(*np), GFP_KERNEL); if (!np) goto out; np->op = op; + np->data = (struct n2rng_template *)match->data; INIT_DELAYED_WORK(&np->work, n2rng_work); - if (multi_capable) + if (np->data->multi_capable) np->flags |= N2RNG_FLAG_MULTI; err = -ENODEV; @@ -673,8 +672,9 @@ static int n2rng_probe(struct platform_device *op) dev_err(&op->dev, "VF RNG lacks rng-#units property\n"); goto out_hvapi_unregister; } - } else + } else { np->num_units = 1; + } dev_info(&op->dev, "Registered RNG HVAPI major %lu minor %lu\n", np->hvapi_major, np->hvapi_minor); @@ -731,30 +731,61 @@ static int n2rng_remove(struct platform_device *op) return 0; } +static struct n2rng_template n2_template = { + .id = N2_n2_rng, + .multi_capable = 0, + .chip_version = 1, +}; + +static struct n2rng_template vf_template = { + .id = N2_vf_rng, + .multi_capable = 1, + .chip_version = 1, +}; + +static struct n2rng_template kt_template = { + .id = N2_kt_rng, + .multi_capable = 1, + .chip_version = 1, +}; + +static struct n2rng_template m4_template = { + 
.id = N2_m4_rng, + .multi_capable = 1, + .chip_version = 2, +}; + +static struct n2rng_template m7_template = { + .id = N2_m7_rng, + .multi_capable = 1, + .chip_version = 2, +}; + static const struct of_device_id n2rng_match[] = { { .name = "random-number-generator", .compatible = "SUNW,n2-rng", + .data = &n2_template, }, { .name = "random-number-generator", .compatible = "SUNW,vf-rng", - .data = (void *) 1, + .data = &vf_template, }, { .name = "random-number-generator", .compatible = "SUNW,kt-rng", - .data = (void *) 1, + .data = &kt_template, }, { .name = "random-number-generator", .compatible = "ORCL,m4-rng", - .data = (void *) 1, + .data = &m4_template, }, { .name = "random-number-generator", .compatible = "ORCL,m7-rng", - .data = (void *) 1, + .data = &m7_template, }, {}, }; diff --git a/drivers/char/hw_random/n2rng.h b/drivers/char/hw_random/n2rng.h index f244ac8..e41e55a 100644 --- a/drivers/char/hw_random/n2rng.h +++ b/drivers/char/hw_random/n2rng.h @@ -60,6 +60,20 @@ extern unsigned long sun4v_rng_data_read_diag_v2(unsigned long data_ra, extern unsigned long sun4v_rng_data_read(unsigned long data_ra, unsigned long *tick_delta); +enum n2rng_compat_id { + N2_n2_rng, + N2_vf_rng, + N2_kt_rng, + N2_m4_rng, + N2_m7_rng, +}; + +struct n2rng_template { + enum n2rng_compat_id id; + int multi_capable; + int chip_version; +}; + struct n2rng_unit { u64 control[HV_RNG_NUM_CONTROL]; }; @@ -74,6 +88,7 @@ struct n2rng { #define N2RNG_FLAG_SHUTDOWN0x0010 /* Driver unregistering*/ #define N2RNG_FLAG_BUFFER_VALID0x0020 /* u32 buffer holds valid data */ + struct n2rng_template *data; int num_units; struct n2rng_unit *units; -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordo
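The match-data pattern this patch moves to can be sketched outside the kernel. This is not the kernel's `of_device_id` machinery; the struct and function names below are invented stand-ins that loosely mirror the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the pattern: instead of hiding a single flag in the match
 * table's data pointer, point it at a per-device template struct so
 * further attributes (here chip_version) can be added without changing
 * the lookup code.
 */
struct rng_template {
	int multi_capable;
	int chip_version;
};

struct match_entry {
	const char *compatible;
	const void *data;
};

static const struct rng_template vf_template = { .multi_capable = 1, .chip_version = 1 };
static const struct rng_template m7_template = { .multi_capable = 1, .chip_version = 2 };

static const struct match_entry match_table[] = {
	{ "SUNW,vf-rng", &vf_template },
	{ "ORCL,m7-rng", &m7_template },
	{ NULL, NULL },	/* sentinel, as in of_device_id tables */
};

static const struct rng_template *match_device(const char *compatible)
{
	const struct match_entry *m;

	for (m = match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```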
[PATCH 3/4] n2rng: support new hardware register layout
Add the new register layout constants and the requisite logic for using them. Signed-off-by: Shannon Nelson --- drivers/char/hw_random/n2-drv.c | 144 +-- drivers/char/hw_random/n2rng.h | 36 +++--- 2 files changed, 134 insertions(+), 46 deletions(-) diff --git a/drivers/char/hw_random/n2-drv.c b/drivers/char/hw_random/n2-drv.c index 74c26c7..f0bd5ee 100644 --- a/drivers/char/hw_random/n2-drv.c +++ b/drivers/char/hw_random/n2-drv.c @@ -302,26 +302,57 @@ static int n2rng_try_read_ctl(struct n2rng *np) return n2rng_hv_err_trans(hv_err); } -#define CONTROL_DEFAULT_BASE \ - ((2 << RNG_CTL_ASEL_SHIFT) |\ -(N2RNG_ACCUM_CYCLES_DEFAULT << RNG_CTL_WAIT_SHIFT) | \ -RNG_CTL_LFSR) - -#define CONTROL_DEFAULT_0 \ - (CONTROL_DEFAULT_BASE | \ -(1 << RNG_CTL_VCO_SHIFT) | \ -RNG_CTL_ES1) -#define CONTROL_DEFAULT_1 \ - (CONTROL_DEFAULT_BASE | \ -(2 << RNG_CTL_VCO_SHIFT) | \ -RNG_CTL_ES2) -#define CONTROL_DEFAULT_2 \ - (CONTROL_DEFAULT_BASE | \ -(3 << RNG_CTL_VCO_SHIFT) | \ -RNG_CTL_ES3) -#define CONTROL_DEFAULT_3 \ - (CONTROL_DEFAULT_BASE | \ -RNG_CTL_ES1 | RNG_CTL_ES2 | RNG_CTL_ES3) +static u64 n2rng_control_default(struct n2rng *np, int ctl) +{ + u64 val = 0; + + if (np->data->chip_version == 1) { + val = ((2 << RNG_v1_CTL_ASEL_SHIFT) | + (N2RNG_ACCUM_CYCLES_DEFAULT << RNG_v1_CTL_WAIT_SHIFT) | +RNG_CTL_LFSR); + + switch (ctl) { + case 0: + val |= (1 << RNG_v1_CTL_VCO_SHIFT) | RNG_CTL_ES1; + break; + case 1: + val |= (2 << RNG_v1_CTL_VCO_SHIFT) | RNG_CTL_ES2; + break; + case 2: + val |= (3 << RNG_v1_CTL_VCO_SHIFT) | RNG_CTL_ES3; + break; + case 3: + val |= RNG_CTL_ES1 | RNG_CTL_ES2 | RNG_CTL_ES3; + break; + default: + break; + } + + } else { + val = ((2 << RNG_v2_CTL_ASEL_SHIFT) | + (N2RNG_ACCUM_CYCLES_DEFAULT << RNG_v2_CTL_WAIT_SHIFT) | +RNG_CTL_LFSR); + + switch (ctl) { + case 0: + val |= (1 << RNG_v2_CTL_VCO_SHIFT) | RNG_CTL_ES1; + break; + case 1: + val |= (2 << RNG_v2_CTL_VCO_SHIFT) | RNG_CTL_ES2; + break; + case 2: + val |= (3 << RNG_v2_CTL_VCO_SHIFT) | RNG_CTL_ES3; + break; + 
case 3: + val |= RNG_CTL_ES1 | RNG_CTL_ES2 | RNG_CTL_ES3; + break; + default: + break; + } + } + + return val; +} static void n2rng_control_swstate_init(struct n2rng *np) { @@ -336,10 +367,10 @@ static void n2rng_control_swstate_init(struct n2rng *np) for (i = 0; i < np->num_units; i++) { struct n2rng_unit *up = &np->units[i]; - up->control[0] = CONTROL_DEFAULT_0; - up->control[1] = CONTROL_DEFAULT_1; - up->control[2] = CONTROL_DEFAULT_2; - up->control[3] = CONTROL_DEFAULT_3; + up->control[0] = n2rng_control_default(np, 0); + up->control[1] = n2rng_control_default(np, 1); + up->control[2] = n2rng_control_default(np, 2); + up->control[3] = n2rng_control_default(np, 3); } np->hv_state = HV_RNG_STATE_UNCONFIGURED; @@ -399,6 +430,7 @@ static int n2rng_data_read(struct hwrng *rng, u32 *data) } else { int err = n2rng_generic_read_data(ra); if (!err) { + np->flags |= N2RNG_FLAG_BUFFER_VALID; np->buffer = np->test_data >> 32; *data = np->test_data & 0x; len = 4; @@ -487,9 +519,21 @@ static void n2rng_dump_test_buffer(struct n2rng *np) static int n2rng_check_selftest_buffer(struct n2rng *np, unsigned long unit) { - u64 val = SELFTEST_VAL; + u64 val; int err, matches, limit; + switch (np->data->id) { + case N2_n2_rng: + case N2_vf_rng: + case N2_kt_rng: + case N2_m4_rng: /* yes, m4 uses the old value */ + val = RNG_v1_SELFTEST_VAL; + break; + default: + val = RNG_v2_SELFTEST_VAL; + break; + } + matches = 0; for (limit = 0; limit < SELFTEST_LOOPS_MAX; limit++) { matches += n2rng_test_buffer_find(np, val); @@ -5
Re: [PATCH v1 3/8] crypto:chcr- Fix key length for RFC4106
On Thu, Jan 12, 2017 at 10:08:46PM +0530, Harsh Jain wrote: > > That case is already handled in next if condition. It will error out with > -EINVAL in next condition. > > if (keylen == AES_KEYSIZE_128) { Good point. Please split the patches according to whether they should go into 4.10/4.11 and then resubmit. Thanks, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
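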
Re: [PATCH v2 0/7] crypto: ARM/arm64 - AES and ChaCha20 updates for v4.11
On 12 January 2017 at 16:45, Herbert Xu wrote: > On Wed, Jan 11, 2017 at 04:41:48PM +, Ard Biesheuvel wrote: >> This adds ARM and arm64 implementations of ChaCha20, scalar AES and SIMD >> AES (using bit slicing). The SIMD algorithms in this series take advantage >> of the new skcipher walksize attribute to iterate over the input in the most >> efficient manner possible. >> >> Patch #1 adds a NEON implementation of ChaCha20 for ARM. >> >> Patch #2 adds a NEON implementation of ChaCha20 for arm64. >> >> Patch #3 modifies the existing NEON and ARMv8 Crypto Extensions >> implementations >> of AES-CTR to be available as a synchronous skcipher as well. This is >> intended >> for the mac80211 code, which uses synchronous encapsulations of ctr(aes) >> [ccm, gcm] in softirq context, during which arm64 supports use of SIMD code. >> >> Patch #4 adds a scalar implementation of AES for arm64, using the key >> schedule >> generation routines and lookup tables of the generic code in >> crypto/aes_generic. >> >> Patch #5 does the same for ARM, replacing existing scalar code that >> originated >> in the OpenSSL project, and contains redundant key schedule generation >> routines >> and lookup tables (and is slightly slower on modern cores) >> >> Patch #6 replaces the ARM bit sliced NEON code with a new implementation that >> has a number of advantages over the original code (which also originated in >> the >> OpenSSL project.) The performance should be identical. >> >> Patch #7 adds a port of the ARM bit-sliced AES code to arm64, in ECB, CBC, >> CTR >> and XTS modes. >> >> Due to the size of patch #7, it may be difficult to apply these patches from >> patchwork, so I pushed them here as well: > > It seems to have made it. > > All applied. Thanks. Actually, patch #6 was the huge one not #7, and I don't see it in your tree yet. 
https://git.kernel.org/cgit/linux/kernel/git/ardb/linux.git/commit/?h=crypto-arm-v4.11&id=cbf03b255f7c The order does not matter, though, so could you please put it on top? Thanks. -- Ard. -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] crypto: mediatek - fix format string for 64-bit builds
On Wed, Jan 11, 2017 at 02:55:20PM +0100, Arnd Bergmann wrote: > After I enabled COMPILE_TEST for non-ARM targets, I ran into these > warnings: > > crypto/mediatek/mtk-aes.c: In function 'mtk_aes_info_map': > crypto/mediatek/mtk-aes.c:224:28: error: format '%d' expects argument of type > 'int', but argument 3 has type 'long unsigned int' [-Werror=format=] >dev_err(cryp->dev, "dma %d bytes error\n", sizeof(*info)); > crypto/mediatek/mtk-sha.c:344:28: error: format '%d' expects argument of type > 'int', but argument 3 has type 'long unsigned int' [-Werror=format=] > crypto/mediatek/mtk-sha.c:550:21: error: format '%u' expects argument of type > 'unsigned int', but argument 4 has type 'size_t {aka long unsigned int}' > [-Werror=format=] > > The correct format for size_t is %zu, so use that in all three > cases. > > Fixes: 785e5c616c84 ("crypto: mediatek - Add crypto driver support for some > MediaTek chips") > Signed-off-by: Arnd Bergmann Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
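The warning class fixed above comes down to one rule: `sizeof` yields a `size_t`, whose portable printf conversion is `%zu`; with `%d`, a 64-bit build passes a `long unsigned int` where an `int` is read, and the compiler rightly warns. A minimal sketch (the struct is just an invented stand-in for the driver's descriptor):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the driver descriptor whose size is being printed. */
struct dma_info {
	char payload[24];
};

/*
 * Format the message the way the fixed driver does: %zu matches size_t
 * on both 32-bit and 64-bit builds.
 */
static void format_err(char *buf, size_t buflen)
{
	snprintf(buf, buflen, "dma %zu bytes error", sizeof(struct dma_info));
}
```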
Re: [PATCH v2 0/7] crypto: ARM/arm64 - AES and ChaCha20 updates for v4.11
On Wed, Jan 11, 2017 at 04:41:48PM +, Ard Biesheuvel wrote: > This adds ARM and arm64 implementations of ChaCha20, scalar AES and SIMD > AES (using bit slicing). The SIMD algorithms in this series take advantage > of the new skcipher walksize attribute to iterate over the input in the most > efficient manner possible. > > Patch #1 adds a NEON implementation of ChaCha20 for ARM. > > Patch #2 adds a NEON implementation of ChaCha20 for arm64. > > Patch #3 modifies the existing NEON and ARMv8 Crypto Extensions > implementations > of AES-CTR to be available as a synchronous skcipher as well. This is intended > for the mac80211 code, which uses synchronous encapsulations of ctr(aes) > [ccm, gcm] in softirq context, during which arm64 supports use of SIMD code. > > Patch #4 adds a scalar implementation of AES for arm64, using the key schedule > generation routines and lookup tables of the generic code in > crypto/aes_generic. > > Patch #5 does the same for ARM, replacing existing scalar code that originated > in the OpenSSL project, and contains redundant key schedule generation > routines > and lookup tables (and is slightly slower on modern cores) > > Patch #6 replaces the ARM bit sliced NEON code with a new implementation that > has a number of advantages over the original code (which also originated in > the > OpenSSL project.) The performance should be identical. > > Patch #7 adds a port of the ARM bit-sliced AES code to arm64, in ECB, CBC, CTR > and XTS modes. > > Due to the size of patch #7, it may be difficult to apply these patches from > patchwork, so I pushed them here as well: It seems to have made it. All applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] crypto: mediatek - remove ARM dependencies
On Wed, Jan 11, 2017 at 02:50:19PM +0100, Arnd Bergmann wrote: > Building the mediatek driver on an older ARM architecture results in a > harmless warning: > > warning: (ARCH_OMAP2PLUS_TYPICAL && CRYPTO_DEV_MEDIATEK) selects NEON which > has unmet direct dependencies (VFPv3 && CPU_V7) > > We could add an explicit dependency on CPU_V7, but it seems nicer to > open up the build to additional configurations. This replaces the ARM > optimized algorithm selection with the normal one that all other drivers > use, and that in turn lets us relax the dependency on ARM and drop > a number of the unrelated 'select' statements. > > Obviously a real user would still select those other optimized drivers > as a fallback, but as there is no strict dependency, we can leave that > up to the user. > > Fixes: 785e5c616c84 ("crypto: mediatek - Add crypto driver support for some > MediaTek chips") > Signed-off-by: Arnd Bergmann Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 8/8] crypto/testmgr: Allocate only the required output size for hash tests
On Tue, Jan 10, 2017 at 03:24:46PM -0800, Andy Lutomirski wrote: > There are some hashes (e.g. sha224) that have some internal trickery > to make sure that only the correct number of output bytes are > generated. If something goes wrong, they could potentially overrun > the output buffer. > > Make the test more robust by allocating only enough space for the > correct output size so that memory debugging will catch the error if > the output is overrun. > > Tested by intentionally breaking sha224 to output all 256 > internally-generated bits while running on KASAN. > > Cc: Ard Biesheuvel > Cc: Herbert Xu > Signed-off-by: Andy Lutomirski Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
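The testing idea above — allocate exactly the advertised size so any overrun lands outside the buffer where a memory debugger can trap it — can be sketched in userspace C. `fake_hash` and `check_hash_output` are invented names; in the real testmgr change the buffer holds a digest and the trap is sprung by KASAN (ASan plays the same role in userspace).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a digest call: a correct hash writes exactly digestsize. */
static void fake_hash(unsigned char *out, size_t digestsize)
{
	memset(out, 0xab, digestsize);
}

/*
 * Allocate with no slack on purpose: if the hash ever wrote more than
 * it claims (e.g. a broken sha224 emitting all 256 internal bits), the
 * extra bytes would run past the allocation and be caught.
 */
static int check_hash_output(size_t digestsize)
{
	unsigned char *result = malloc(digestsize);

	if (!result)
		return -1;
	fake_hash(result, digestsize);
	free(result);
	return 0;
}
```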
Re: [PATCH] crypto: mediatek: don't return garbage err on successful return
On Tue, Jan 03, 2017 at 01:21:22PM +, Colin King wrote: > From: Colin Ian King > > In the case where keylen <= bs mtk_sha_setkey returns an uninitialized > return value in err. Fix this by returning 0 instead of err. > > Issue detected by static analysis with cppcheck. > > Signed-off-by: Colin Ian King Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
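The bug class fixed here is easy to reproduce in miniature. This is a reduced sketch, not the mediatek driver's actual code: `err` is only assigned on one branch, so the short-key path would return stack garbage, and the fix is to return an explicit 0 (or initialize `err`). `keylen` and `bs` are placeholders for the driver's real parameters.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sketch of the fixed setkey logic: only the keylen > bs path computes
 * err; the other path must return a literal 0, not the (previously
 * uninitialized) err variable.
 */
static int setkey_sketch(size_t keylen, size_t bs)
{
	int err;

	if (keylen > bs) {
		err = -1;	/* stand-in for the "hash the key down" path failing */
		if (err)
			return err;
	}
	return 0;	/* was "return err": garbage when keylen <= bs */
}
```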
Re: [PATCH 0/3] crypto: picoxcell - Cleanups removing non-DT code
On Mon, Jan 02, 2017 at 02:06:56PM -0300, Javier Martinez Canillas wrote: > Hello, > > This small series contains a couple of cleanups that removes some driver's > code > that isn't needed due the driver being for a DT-only platform. > > The changes were suggested by Arnd Bergmann as a response to a previous patch: > https://lkml.org/lkml/2017/1/2/342 > > Patch #1 allows the driver to be built when the COMPILE_TEST option is > enabled. > Patch #2 removes the platform ID table since isn't needed for DT-only drivers. > Patch #3 removes a wrapper function that's also not needed if driver is > DT-only. All applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: Replaced gcc specific attributes with macros from compiler.h
On Sat, Dec 31, 2016 at 09:26:23PM +0530, gidisr...@gmail.com wrote: > From: Gideon Israel Dsouza > > Continuing from this commit: 52f5684c8e1e > ("kernel: use macros from compiler.h instead of __attribute__((...))") > > I submitted 4 total patches. They are part of task I've taken up to > increase compiler portability in the kernel. I've cleaned up the > subsystems under /kernel /mm /block and /security, this patch targets > /crypto. > > There is which provides macros for various gcc specific > constructs. Eg: __weak for __attribute__((weak)). I've cleaned all > instances of gcc specific attributes with the right macros for the crypto > subsystem. > > I had to make one additional change into compiler-gcc.h for the case when > one wants to use this: __attribute__((aligned) and not specify an alignment > factor. From the gcc docs, this will result in the largest alignment for > that data type on the target machine so I've named the macro > __aligned_largest. Please advise if another name is more appropriate. > > Signed-off-by: Gideon Israel Dsouza Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: testmgr - use kmemdup instead of kmalloc+memcpy
On Fri, Dec 30, 2016 at 02:12:00PM -0600, Eric Biggers wrote: > From: Eric Biggers > > It's recommended to use kmemdup instead of kmalloc followed by memcpy. > > Signed-off-by: Eric Biggers Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
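For readers outside the kernel, kmemdup() simply collapses the allocate-then-copy idiom into one call. A userspace analogue (the kernel version additionally takes a gfp_t flags argument such as GFP_KERNEL) is a few lines:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace analogue of the kernel's kmemdup(): allocate a buffer and copy
 * "len" bytes into it in one call, instead of open-coding malloc+memcpy at
 * every call site. */
static void *memdup(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}
```

Besides being shorter, the combined helper makes it impossible to copy a different length than was allocated, which is the usual bug in hand-rolled versions.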
Re: [PATCH v1 3/8] crypto:chcr- Fix key length for RFC4106
On 12-01-2017 21:39, Herbert Xu wrote:
> On Fri, Jan 06, 2017 at 02:01:34PM +0530, Harsh Jain wrote:
>> Check keylen before copying salt to avoid wrap around of integer.
>>
>> Signed-off-by: Harsh Jain
>> ---
>>  drivers/crypto/chelsio/chcr_algo.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
>> index deec7c0..6c2dea3 100644
>> --- a/drivers/crypto/chelsio/chcr_algo.c
>> +++ b/drivers/crypto/chelsio/chcr_algo.c
>> @@ -2194,8 +2194,8 @@ static int chcr_gcm_setkey(struct crypto_aead *aead, const u8 *key,
>>  	unsigned int ck_size;
>>  	int ret = 0, key_ctx_size = 0;
>>
>> -	if (get_aead_subtype(aead) ==
>> -	    CRYPTO_ALG_SUB_TYPE_AEAD_RFC4106) {
>> +	if (get_aead_subtype(aead) == CRYPTO_ALG_SUB_TYPE_AEAD_RFC4106 &&
>> +	    keylen > 3) {
>>  		keylen -= 4;  /* nonce/salt is present in the last 4 bytes */
>>  		memcpy(aeadctx->salt, key + keylen, 4);
>>  	}
>
> We should return an error in this case.

That case is already handled by the next if condition. It will error out
with -EINVAL there:

	if (keylen == AES_KEYSIZE_128) {

> Cheers,
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
Am Freitag, 13. Januar 2017, 00:06:29 CET schrieb Herbert Xu:

Hi Herbert,

> On Thu, Jan 12, 2017 at 05:01:13PM +0100, Stephan Müller wrote:
> > I fully agree. Therefore, I was under the impression that disregarding
> > the AAD in recvmsg entirely would be most appropriate, as offered with
> > the patch "crypto: AF_ALG - disregard AAD buffer space for output". In
> > this case we would be fully POSIX compliant, the kernel would not copy
> > the AAD (and thus would avoid multiple memcpy operations due to
> > copy_from_user and copy_to_user round trips), and the AAD copy
> > operation would be left entirely to user space.
>
> Yes but then you'd have to play nasty games to fit this through
> the kernel API.

I do not understand that statement. With the patch mentioned above that I
provided some weeks ago, we have the following scenario for an encryption
(for decryption it is almost identical, just the tag location is reversed):

user calls sendmsg with data buffer/IOVEC: AAD || PT
-> algif_aead turns this into the src SGL

user calls recvmsg with data buffer/IOVEC: CT || Tag
-> algif_aead creates the first SG entry in the dst SGL pointing to the AAD
   from the src SGL
-> algif_aead appends the user buffers to the dst SGL
-> algif_aead performs its operation, and during that operation only the CT
   and Tag parts are changed

I.e. by prepending the SG that points to the AAD from the src SGL to the
dst SGL, we have a clean invocation of the kernel API. Yet we avoid copy
round trips of the AAD from src to dst in the kernel.

> Besides, we could still do in-place crypto even
> though you suggested that it's complicated.

Is that crypto-in-place operation really a goal we want to achieve if we
"buy" it with a memcpy of the data from the src SGL to the dst SGL before
the crypto operation? Can we really call such an implementation a
crypto-in-place operation?

> It's not really.

I concur, for encryption the suggested memcpy is not difficult. Only for
decryption is the memcpy more challenging.

> All we have to do is walk through the SG list and compare each
> page/offset. For the common case it's going to be a single-entry
> list.

Ciao
Stephan
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
On Thu, Jan 12, 2017 at 04:50:46PM +0100, Stephan Müller wrote: > > So you say that we could remove it from authenc() entirely (this is currently > the only location where such copy operation is found for the encryption > direction)? > > I would concur that the kernel does not need that. authenc needs it for performance reasons. The kernel API as it stands basically says that it may or may not copy the AAD. If you choose to have a distinct AAD in the dst SG list then either you throw away the result or you copy it yourself. > If we only want to solve that for algif_aead, wouldn't it make more sense if > the user space caller takes care of that (such as libkcapi)? By tinkering > with > the SGLs and copying the data to the dst buffer before the cipher operation > takes place, I guess we will add performance degradation and more complexity > in the kernel. > > Having such logic in user space would keep the algif_aead cleaner IMHO. We need to have a sane kernel API that respects POSIX. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] fix itnull.cocci warnings
On Sat, Jan 07, 2017 at 10:46:17AM +0100, Julia Lawall wrote: > The first argument to list_for_each_entry cannot be NULL. > > Generated by: scripts/coccinelle/iterators/itnull.cocci > > CC: Harsh Jain > Signed-off-by: Julia Lawall > Signed-off-by: Fengguang Wu > --- > > This code comes from the following git tree: > > url: > https://github.com/0day-ci/linux/commits/Harsh-Jain/crypto-chcr-Bug-fixes/20170107-093356 > base: > https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git > master > In-Reply-To: > <8e0086b56d8fb61637d179c32a09a1bca03c4186.1483599449.git.ha...@chelsio.com> Harsh, please fold this patch into your series when you resubmit. Thanks, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
Am Freitag, 13. Januar 2017, 00:17:59 CET schrieb Herbert Xu: Hi Herbert, > On Thu, Jan 12, 2017 at 05:10:14PM +0100, Stephan Müller wrote: > > Each IOCB would transpire into an independent, separate recvmsg invocation > > without an additional sendmsg/sendpage operation. Thus, in order to > > support > > multiple IOCBs, all data the multiple recvmsg invocations will operate on > > must be injected into the kernel beforehand. > > I don't understand, what's wrong with: > > sendmsg(fd, ...) > aio_read(iocb1) > sendmsg(fd, ...) > aio_read(iocb2) Sure, that works. But here you limit yourself to one IOCB per aio_read. But aio_read supports multiple IOCBs in one invocation. And this is the issue I am considering. Ciao Stephan -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
On Thu, Jan 12, 2017 at 05:10:14PM +0100, Stephan Müller wrote: > > Each IOCB would transpire into an independent, separate recvmsg invocation > without an additional sendmsg/sendpage operation. Thus, in order to support > multiple IOCBs, all data the multiple recvmsg invocations will operate on > must > be injected into the kernel beforehand. I don't understand, what's wrong with: sendmsg(fd, ...) aio_read(iocb1) sendmsg(fd, ...) aio_read(iocb2) Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 0/8] crypto:chcr- Bug fixes
On Fri, Jan 06, 2017 at 02:01:31PM +0530, Harsh Jain wrote: > The patch series is based on Herbert's cryptodev-2.6 tree. > It include bug fixes. > > Atul Gupta (4): > crypto:chcr-Change flow IDs > crypto:chcr- Fix panic on dma_unmap_sg > crypto:chcr- Check device is allocated before use > crypto:chcr- Fix wrong typecasting > Harsh Jain (4): > crypto:chcr- Fix key length for RFC4106 > crypto:chcr- Use cipher instead of Block Cipher in gcm setkey > crypto:chcr: Change cra_flags for cipher algos > crypto:chcr- Change algo priority When you resubmit this please split it into two series. Please send the critical bug fixes (panic + key length + alloc check) in one series separate from the others. This way I can push them easily to the 4.10 tree. Thanks, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
Am Freitag, 13. Januar 2017, 00:07:39 CET schrieb Herbert Xu: Hi Herbert, > On Thu, Jan 12, 2017 at 05:05:03PM +0100, Stephan Müller wrote: > > Am Donnerstag, 12. Januar 2017, 16:56:04 CET schrieb Stephan Müller: > > > > Hi Herbert, > > > > > That would mean that we would only support one IOCB. > > > > As we also need to be standards compliant, would it be appropriate to only > > support one IOCB? I think this is a significant difference to other AIO > > operations like for networking. > > Why would we be limited to one IOCB? Each IOCB would transpire into an independent, separate recvmsg invocation without an additional sendmsg/sendpage operation. Thus, in order to support multiple IOCBs, all data the multiple recvmsg invocations will operate on must be injected into the kernel beforehand. Ciao Stephan -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v1 3/8] crypto:chcr- Fix key length for RFC4106
On Fri, Jan 06, 2017 at 02:01:34PM +0530, Harsh Jain wrote:
> Check keylen before copying salt to avoid wrap around of integer.
>
> Signed-off-by: Harsh Jain
> ---
>  drivers/crypto/chelsio/chcr_algo.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
> index deec7c0..6c2dea3 100644
> --- a/drivers/crypto/chelsio/chcr_algo.c
> +++ b/drivers/crypto/chelsio/chcr_algo.c
> @@ -2194,8 +2194,8 @@ static int chcr_gcm_setkey(struct crypto_aead *aead, const u8 *key,
>  	unsigned int ck_size;
>  	int ret = 0, key_ctx_size = 0;
>
> -	if (get_aead_subtype(aead) ==
> -	    CRYPTO_ALG_SUB_TYPE_AEAD_RFC4106) {
> +	if (get_aead_subtype(aead) == CRYPTO_ALG_SUB_TYPE_AEAD_RFC4106 &&
> +	    keylen > 3) {
>  		keylen -= 4;  /* nonce/salt is present in the last 4 bytes */
>  		memcpy(aeadctx->salt, key + keylen, 4);
>  	}

We should return an error in this case.

Cheers,
-- 
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
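The `keylen > 3` guard matters because `keylen` is `unsigned int`: subtracting the 4-byte salt from a shorter key wraps around to a huge value instead of going negative, and that value would then be used as an offset into the key buffer. A minimal sketch (the helper name `strip_salt_len` is illustrative, not driver code):

```c
#include <assert.h>

/* Demonstrates the unsigned wraparound the chcr patch guards against:
 * without the check, keylen -= 4 on a keylen < 4 wraps around, and the
 * resulting huge keylen would be used as an offset into the key buffer. */
static unsigned int strip_salt_len(unsigned int keylen)
{
	if (keylen > 3)	/* the guard added by the patch */
		keylen -= 4;
	return keylen;
}
```

As discussed in the thread, a too-short key that skips the subtraction is then rejected by the subsequent key-size check, so the guard only needs to prevent the wraparound itself.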
Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
On Thu, Jan 12, 2017 at 05:05:03PM +0100, Stephan Müller wrote: > Am Donnerstag, 12. Januar 2017, 16:56:04 CET schrieb Stephan Müller: > > Hi Herbert, > > > > > That would mean that we would only support one IOCB. > > As we also need to be standards compliant, would it be appropriate to only > support one IOCB? I think this is a significant difference to other AIO > operations like for networking. Why would we be limited to one IOCB? -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
On Thu, Jan 12, 2017 at 05:01:13PM +0100, Stephan Müller wrote: > > I fully agree. Therefore, I was under the impression that disregarding the > AAD > in recvmsg entirely would be most appropriate as offered with the patch > "crypto: AF_ALG - disregard AAD buffer space for output". In this case we > would be fully POSIX compliant, the kernel would not copy the AAD (and thus > perform multiple memcpy operations due to copy_from_user and copy_to_user > round trips) and leave the AAD copy operation entirely to user space. Yes but then you'd have to play nasty games to fit this through the kernel API. Besides, we could still do in-place crypto even though you suggested that it's complicated. It's not really. All we have to do is walk through the SG list and compare each page/offset. For the common case it's going to be a single-entry list. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
Am Donnerstag, 12. Januar 2017, 16:56:04 CET schrieb Stephan Müller: Hi Herbert, > > That would mean that we would only support one IOCB. As we also need to be standards compliant, would it be appropriate to only support one IOCB? I think this is a significant difference to other AIO operations like for networking. Ciao Stephan -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
Am Donnerstag, 12. Januar 2017, 23:53:44 CET schrieb Herbert Xu: Hi Herbert, > > > If we only want to solve that for algif_aead, wouldn't it make more sense > > if the user space caller takes care of that (such as libkcapi)? By > > tinkering with the SGLs and copying the data to the dst buffer before the > > cipher operation takes place, I guess we will add performance degradation > > and more complexity in the kernel. > > > > Having such logic in user space would keep the algif_aead cleaner IMHO. > > We need to have a sane kernel API that respects POSIX. I fully agree. Therefore, I was under the impression that disregarding the AAD in recvmsg entirely would be most appropriate as offered with the patch "crypto: AF_ALG - disregard AAD buffer space for output". In this case we would be fully POSIX compliant, the kernel would not copy the AAD (and thus perform multiple memcpy operations due to copy_from_user and copy_to_user round trips) and leave the AAD copy operation entirely to user space. Ciao Stephan -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
Am Donnerstag, 12. Januar 2017, 23:51:28 CET schrieb Herbert Xu:

Hi Herbert,

> On Sun, Dec 25, 2016 at 06:15:06PM +0100, Stephan Müller wrote:
> > + * The following concept of the memory management is used:
> > + *
> > + * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL
> > + * is filled by user space with the data submitted via sendpage/sendmsg.
> > + * Filling up the TX SGL does not cause a crypto operation -- the data
> > + * will only be tracked by the kernel. Upon receipt of one recvmsg call,
> > + * the caller must provide a buffer which is tracked with the RX SGL.
> > + *
> > + * During the processing of the recvmsg operation, the cipher request is
> > + * allocated and prepared. To support multiple recvmsg operations
> > + * operating on one TX SGL, an offset pointer into the TX SGL is
> > + * maintained. The TX SGL that is used for the crypto request is
> > + * scatterwalk_ffwd by the offset pointer to obtain the start address
> > + * the crypto operation shall use for the input data.
>
> I think this is overcomplicating things. The async interface
> should be really simple. It should be exactly the same as the
> sync interface, except that completion is out-of-line.
>
> So there should be no mixing of SGLs from different requests.
> Just start with a clean slate after each recvmsg regardless of
> whether it's sync or async.
>
> The only difference in the async case is that you need to keep a
> reference to the old pages and free them upon completion. But this
> should in no way interfere with subsequent requests.

That would mean that we would only support one IOCB. At least with
algif_skcipher, having multiple IOCBs would reduce the number of system
calls user space needs to make for multiple plaintext/ciphertext blocks.
But then, with the use of IOVECs, user space could provide all input data
with one system call anyway.

Ok, I will update the patch as suggested.

> Cheers,

Ciao
Stephan
Re: [PATCH 1/2] crypto: aead AF_ALG - overhaul memory management
On Sun, Dec 25, 2016 at 06:15:06PM +0100, Stephan Müller wrote: > > + * The following concept of the memory management is used: > + * > + * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is > + * filled by user space with the data submitted via sendpage/sendmsg. Filling > + * up the TX SGL does not cause a crypto operation -- the data will only be > + * tracked by the kernel. Upon receipt of one recvmsg call, the caller must > + * provide a buffer which is tracked with the RX SGL. > + * > + * During the processing of the recvmsg operation, the cipher request is > + * allocated and prepared. To support multiple recvmsg operations operating > + * on one TX SGL, an offset pointer into the TX SGL is maintained. The TX SGL > + * that is used for the crypto request is scatterwalk_ffwd by the offset > + * pointer to obtain the start address the crypto operation shall use for > + * the input data. I think this is overcomplicating things. The async interface should be really simple. It should be exactly the same as the sync interface, except that completion is out-of-line. So there should be no mixing of SGLs from different requests. Just start with a clean slate after each recvmsg regardless of whether it's sync or async. The only difference in the async case is that you need to keep a reference to the old pages and free them upon completion. But this should in no way interfere with subsequent requests. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
Am Donnerstag, 12. Januar 2017, 23:39:24 CET schrieb Herbert Xu: Hi Herbert, > On Thu, Jan 12, 2017 at 04:34:39PM +0100, Stephan Müller wrote: > > We would only be able to remove it if all AEAD implementations are > > converted. But for the conversion time, we do face that issue. > > It doesn't matter. Nobody in the kernel uses that. In fact I > wonder whether we should even do it for the kernel API. We only > need it for the user-space API because it goes through read/write. So you say that we could remove it from authenc() entirely (this is currently the only location where such copy operation is found for the encryption direction)? I would concur that the kernel does not need that. > > > Are you suggesting that the entire data in the src SGL is first copied to > > the dst SGL by algif_aead? If yes, that still requires significant > > src/dst SGL tinkering as we have the tag -- the src SGL for encrypt does > > not have the tag space where the dst SGL for encrypt is required to have > > the tag size. This is vice versa for the decryption operation. > > It's really only a problem for decryption. In that case you can > extend the dst SG list to include the tag. If we only want to solve that for algif_aead, wouldn't it make more sense if the user space caller takes care of that (such as libkcapi)? By tinkering with the SGLs and copying the data to the dst buffer before the cipher operation takes place, I guess we will add performance degradation and more complexity in the kernel. Having such logic in user space would keep the algif_aead cleaner IMHO. Ciao Stephan -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] crypto: kpp - clear CRYPTO_ALG_DEAD bit in prepare_alg
On Mon, Jan 02, 2017 at 01:33:41PM +, Salvatore Benedetto wrote: > Make sure CRYPTO_ALG_DEAD is not set when preparing for > alg registration. This fixes qat-dh registration that occurs > when reloading qat_c62x module. > > Signed-off-by: Salvatore Benedetto > --- > crypto/kpp.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/crypto/kpp.c b/crypto/kpp.c > index d36ce05..d1adef8e 100644 > --- a/crypto/kpp.c > +++ b/crypto/kpp.c > @@ -101,6 +101,7 @@ static void kpp_prepare_alg(struct kpp_alg *alg) > > base->cra_type = &crypto_kpp_type; > base->cra_flags &= ~CRYPTO_ALG_TYPE_MASK; > + base->cra_flags &= ~CRYPTO_ALG_DEAD; > base->cra_flags |= CRYPTO_ALG_TYPE_KPP; > } Same comment as in https://patchwork.kernel.org/patch/9485115/ Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
On Thu, Jan 12, 2017 at 04:34:39PM +0100, Stephan Müller wrote: > > We would only be able to remove it if all AEAD implementations are converted. > But for the conversion time, we do face that issue. It doesn't matter. Nobody in the kernel uses that. In fact I wonder whether we should even do it for the kernel API. We only need it for the user-space API because it goes through read/write. > Are you suggesting that the entire data in the src SGL is first copied to the > dst SGL by algif_aead? If yes, that still requires significant src/dst SGL > tinkering as we have the tag -- the src SGL for encrypt does not have the tag > space where the dst SGL for encrypt is required to have the tag size. This is > vice versa for the decryption operation. It's really only a problem for decryption. In that case you can extend the dst SG list to include the tag. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
Am Donnerstag, 12. Januar 2017, 23:27:10 CET schrieb Herbert Xu: Hi Herbert, > On Thu, Jan 12, 2017 at 04:23:50PM +0100, Stephan Müller wrote: > > As far as I understand, we have the following AAD copy operations that we > > discuss: > > > > - algif_aead: you suggested to add the AAD copy operation here > > > > - AEAD implementations: over time, the AEAD implementations shall receive > > the AAD copy operation. The AAD copy operation shall only take place if > > the src SGL != dst SGL > > If and when that happens we'd simply need to remove the copy from > algif_aead code. We would only be able to remove it if all AEAD implementations are converted. But for the conversion time, we do face that issue. > But even if we didn't you wouldn't have two copies > because algif_aead should invoke the in-place code-path after doing > a full copy of src to dst. Are you suggesting that the entire data in the src SGL is first copied to the dst SGL by algif_aead? If yes, that still requires significant src/dst SGL tinkering as we have the tag -- the src SGL for encrypt does not have the tag space where the dst SGL for encrypt is required to have the tag size. This is vice versa for the decryption operation. And to me the most elegant solution seems adding the copy operation to crypto_aead_[en|de]crypt. Ciao Stephan -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
On Thu, Jan 12, 2017 at 04:23:50PM +0100, Stephan Müller wrote: > > As far as I understand, we have the following AAD copy operations that we > discuss: > > - algif_aead: you suggested to add the AAD copy operation here > > - AEAD implementations: over time, the AEAD implementations shall receive the > AAD copy operation. The AAD copy operation shall only take place if the src > SGL != dst SGL If and when that happens we'd simply need to remove the copy from algif_aead code. But even if we didn't you wouldn't have two copies because algif_aead should invoke the in-place code-path after doing a full copy of src to dst. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
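The in-place detection Herbert alludes to here and later in the thread (walking both SG lists and comparing each page/offset) can be sketched with a simplified stand-in for struct scatterlist; the struct and function names below are illustrative, not kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model of the check described in the thread: src and dst are
 * "in place" iff the two lists line up pairwise on page and offset. The
 * struct is a stand-in for struct scatterlist. */
struct sg_ent {
	const void *page;	/* backing page */
	size_t offset;		/* offset within the page */
};

static bool sgl_in_place(const struct sg_ent *src, const struct sg_ent *dst,
			 size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (src[i].page != dst[i].page ||
		    src[i].offset != dst[i].offset)
			return false;
	/* common case: n == 1, so this is a single comparison */
	return true;
}
```

As the thread notes, for the common single-entry list this degenerates to one page/offset comparison, so the cost of the check is negligible.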
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
Am Donnerstag, 12. Januar 2017, 22:57:28 CET schrieb Herbert Xu:

Hi Herbert,

> On Thu, Jan 12, 2017 at 12:22:09PM +0100, Stephan Müller wrote:
> > When addressing the issue in the algif_aead code, and expecting that
> > over time the AEAD implementations will gain the copy operation,
> > eventually we will copy the AAD twice. Of course, this could be
> > prevented if the algif_aead code somehow uses the same SGL for the src
> > and dst AAD.
>
> Why would you copy it twice? You copy everything before you start
> and then just do in-place crypto.

As far as I understand, we are discussing the following AAD copy
operations:

- algif_aead: you suggested adding the AAD copy operation here

- AEAD implementations: over time, the AEAD implementations shall receive
  the AAD copy operation. The AAD copy operation shall only take place if
  the src SGL != dst SGL

The algif_aead code will *always* have the src SGL differ from the dst SGL.
Thus, any existing AEAD implementation copy will always be performed, in
addition to the algif_aead AAD copy operation. We can only avoid the AEAD
implementation copy if we add code to algif_aead to ensure that the AAD
data is pointed to by identical src/dst SGL entries. However, such logic to
make the src/dst SGLs identical is quite complex compared to simply using
one callback as suggested in the current patch set.

In the followup email, I suggested adding the AAD copy function invocation
to crypto_aead_encrypt. This way, we can drop the numerous patches to the
AEAD implementations and yet avoid adding such copy operation and src/dst
SGL unification logic to algif_aead.

> > > BTW, why are you only doing the copy for encryption?
> >
> > I was looking at the only AEAD implementation that does the copy
> > operation: authenc. There, the copy operation is only performed for
> > encryption. I was thinking a bit about why decryption was not covered.
> > I think the answer is the following: for encryption, the AAD is
> > definitely needed in the dst buffer, as the dst buffer with the AAD
> > must be sent to the recipient for decryption. The decryption and the
> > associated authentication only work with the AAD. However, after
> > decrypting, all the caller wants is the decrypted plaintext. There is
> > no further use for the AAD after the decryption step. Hence, copying
> > the AAD to the dst buffer in the decryption step would not serve the
> > caller.
>
> That's just the current implementation. If we're going to make
> this an API then we should do it for both directions.

Considering the suggestion above to add the AAD copy call to
crypto_aead_encrypt, we can add such a copy call to crypto_aead_decrypt as
well.

Ciao
Stephan
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 11:06:50PM +0800, Herbert Xu wrote: > On Thu, Jan 12, 2017 at 09:03:18AM -0600, Josh Poimboeuf wrote: > > > > For the entry code, could we just replace all calls with CALL_ALIGNED? > > That might be less intrusive than trying to adjust all the pt_regs > > accesses. > > > > Then to ensure that nobody ever uses 'call' directly: > > > > '#define call please-use-CALL-ALIGNED-instead-of-call' > > > > I think that would be possible if CALL_ALIGNED were a ".macro". > > The trouble with that is that you have got things like TRACE_IRQS_OFF > which are also used outside of the entry code. Where? As far as I can tell, TRACE_IRQS_OFF is used exclusively by entry code. -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 09:03:18AM -0600, Josh Poimboeuf wrote: > On Wed, Jan 11, 2017 at 11:51:10PM -0800, Andy Lutomirski wrote: > > On Wed, Jan 11, 2017 at 11:05 PM, Herbert Xu > > wrote: > > > On Tue, Jan 10, 2017 at 09:05:28AM -0800, Linus Torvalds wrote: > > >> > > >> I'm pretty sure we have random asm code that may not maintain a > > >> 16-byte stack alignment when it calls other code (including, in some > > >> cases, calling C code). > > >> > > >> So I'm not at all convinced that this is a good idea. We shouldn't > > >> expect 16-byte alignment to be something trustworthy. > > > > > > So what if we audited all the x86 assembly code to fix this? Would > > > it then be acceptable to do a 16-byte aligned stack? > > > > > > On the face of it it doesn't seem to be a huge amount of code > > > assuming they mostly live under arch/x86. > > > > The problem is that we have nasties like TRACE_IRQS_OFF. Performance > > doesn't really matter for these macros, so we could probably rig up a > > helper for forcibly align the stack there. Maybe > > FRAME_BEGIN_FORCE_ALIGN? I also think I'd rather not to modify > > pt_regs. We should just fix the small number of code paths that > > create a pt_regs and then call into C code to align the stack. > > > > But if we can't do this with automatic verification, then I'm not sure > > I want to do it at all. The asm is already more precarious than I'd > > like, and having a code path that is misaligned is asking for obscure > > bugs down the road. > > For the entry code, could we just replace all calls with CALL_ALIGNED? > That might be less intrusive than trying to adjust all the pt_regs > accesses. > > Then to ensure that nobody ever uses 'call' directly: > > '#define call please-use-CALL-ALIGNED-instead-of-call' > > I think that would be possible if CALL_ALIGNED were a ".macro". 
To clarify, CALL_ALIGNED could be (completely untested):

.macro CALL_ALIGNED \func
	push	%rbp
	movq	%rsp, %rbp
	and	$0xfffffffffffffff0, %rsp
	call	\func
	movq	%rbp, %rsp
	pop	%rbp
.endm

--
Josh
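The `and` in the macro above is just rounding the stack pointer down to a 16-byte boundary. The same masking, expressed in C for illustration:

```c
#include <stdint.h>

/* Round an address down to the next 16-byte boundary by masking off
 * the low four bits -- the C equivalent of the macro's
 * 'and $0xfffffffffffffff0, %rsp'. */
static uintptr_t align_down_16(uintptr_t sp)
{
        return sp & ~(uintptr_t)0xf;
}
```

Saving %rsp in %rbp first is what lets the macro undo the rounding after the call, since the number of bytes dropped varies.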
Re: x86-64: Maintain 16-byte stack alignment
On Wed, Jan 11, 2017 at 11:51:10PM -0800, Andy Lutomirski wrote: > On Wed, Jan 11, 2017 at 11:05 PM, Herbert Xu > wrote: > > On Tue, Jan 10, 2017 at 09:05:28AM -0800, Linus Torvalds wrote: > >> > >> I'm pretty sure we have random asm code that may not maintain a > >> 16-byte stack alignment when it calls other code (including, in some > >> cases, calling C code). > >> > >> So I'm not at all convinced that this is a good idea. We shouldn't > >> expect 16-byte alignment to be something trustworthy. > > > > So what if we audited all the x86 assembly code to fix this? Would > > it then be acceptable to do a 16-byte aligned stack? > > > > On the face of it it doesn't seem to be a huge amount of code > > assuming they mostly live under arch/x86. > > The problem is that we have nasties like TRACE_IRQS_OFF. Performance > doesn't really matter for these macros, so we could probably rig up a > helper for forcibly align the stack there. Maybe > FRAME_BEGIN_FORCE_ALIGN? I also think I'd rather not to modify > pt_regs. We should just fix the small number of code paths that > create a pt_regs and then call into C code to align the stack. > > But if we can't do this with automatic verification, then I'm not sure > I want to do it at all. The asm is already more precarious than I'd > like, and having a code path that is misaligned is asking for obscure > bugs down the road. For the entry code, could we just replace all calls with CALL_ALIGNED? That might be less intrusive than trying to adjust all the pt_regs accesses. Then to ensure that nobody ever uses 'call' directly: '#define call please-use-CALL-ALIGNED-instead-of-call' I think that would be possible if CALL_ALIGNED were a ".macro". -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH -next] crypto: mediatek - make symbol of_crypto_id static
From: Wei Yongjun Fixes the following sparse warning: drivers/crypto/mediatek/mtk-platform.c:585:27: warning: symbol 'of_crypto_id' was not declared. Should it be static? Signed-off-by: Wei Yongjun --- drivers/crypto/mediatek/mtk-platform.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/crypto/mediatek/mtk-platform.c b/drivers/crypto/mediatek/mtk-platform.c index 286296f..a9c713d 100644 --- a/drivers/crypto/mediatek/mtk-platform.c +++ b/drivers/crypto/mediatek/mtk-platform.c @@ -582,7 +582,7 @@ static int mtk_crypto_remove(struct platform_device *pdev) return 0; } -const struct of_device_id of_crypto_id[] = { +static const struct of_device_id of_crypto_id[] = { { .compatible = "mediatek,eip97-crypto" }, {}, }; -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
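The reason sparse flags this is linkage: a file-scope object without `static` has external linkage, so sparse expects a declaration in a header. A small illustration (stand-in names, not the MediaTek driver code):

```c
#include <stddef.h>

/* A file-scope table used only in this translation unit gets internal
 * linkage via 'static': it cannot clash with a same-named symbol in
 * another file, and sparse stops asking for an extern declaration. */
static const char *const compat_table[] = {
        "mediatek,eip97-crypto",
        NULL,                  /* sentinel, like the empty {} entry */
};

static size_t table_len(void)
{
        size_t n = 0;

        while (compat_table[n])
                n++;
        return n;
}
```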
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 09:03:18AM -0600, Josh Poimboeuf wrote:
>
> For the entry code, could we just replace all calls with CALL_ALIGNED?
> That might be less intrusive than trying to adjust all the pt_regs
> accesses.
>
> Then to ensure that nobody ever uses 'call' directly:
>
> '#define call please-use-CALL-ALIGNED-instead-of-call'
>
> I think that would be possible if CALL_ALIGNED were a ".macro".

The trouble with that is that you have got things like TRACE_IRQS_OFF
which are also used outside of the entry code.

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
On Thu, Jan 12, 2017 at 12:22:09PM +0100, Stephan Müller wrote: > > When addressing the issue in the algif_aead code, and expect that over time > the AEAD implementations will gain the copy operation, eventually we will > copy > the AAD twice. Of course, this could be prevented, if the algif_aead code > somehow uses the same SGL for the src and dst AAD. Why would you copy it twice? You copy everything before you start and then just do in-place crypto. > > BTW, why are you only doing the copy for encryption? > > I was looking at the only AEAD implementation that does the copy operation: > authenc. There, the copy operation is only performed for encryption. I was > thinking a bit about why decryption was not covered. I think the answer is > the > following: for encryption, the AAD is definitely needed in the dst buffer as > the dst buffer with the AAD must be sent to the recipient for decryption. The > decryption and the associated authentication only works with the AAD. > However, > after decrypting, all the caller wants is the decrypted plaintext only. There > is no further use of the AAD after the decryption step. Hence, copying the > AAD > to the dst buffer in the decryption step would not serve the caller. That's just the current implementation. If we're going to make this an API then we should do it for both directions. Cheers, -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 08:46:01AM +0100, Ingo Molnar wrote: > > * Herbert Xu wrote: > > > On Tue, Jan 10, 2017 at 09:05:28AM -0800, Linus Torvalds wrote: > > > > > > I'm pretty sure we have random asm code that may not maintain a > > > 16-byte stack alignment when it calls other code (including, in some > > > cases, calling C code). > > > > > > So I'm not at all convinced that this is a good idea. We shouldn't > > > expect 16-byte alignment to be something trustworthy. > > > > So what if we audited all the x86 assembly code to fix this? Would > > it then be acceptable to do a 16-byte aligned stack? > > Audits for small but deadly details that isn't checked automatically by > tooling > would inevitably bitrot again - and in this particular case there's a 50% > chance > that a new, buggy change would test out to be 'fine' on a kernel developer's > own > box - and break on different configs, different hw or with unrelated (and > innocent) kernel changes, sometime later - spreading the pain unnecessarily. > > So my feeling is that we really need improved tooling for this (and yes, the > GCC > toolchain should have handled this correctly). > > But fortunately we have related tooling in the kernel: could objtool handle > this? > My secret hope was always that objtool would grow into a kind of life > insurance > against toolchain bogosities (which is a must for things like livepatching or > a > DWARF unwinder - but I digress). Are we talking about entry code, or other asm code? Because objtool audits *most* asm code, but entry code is way too specialized for objtool to understand. (I do have a pending objtool rewrite which would make it very easy to ensure 16-byte stack alignment. But again, objtool can only understand callable C or asm functions, not entry code.) Another approach would be to solve this problem with unwinder warnings, *if* there's enough test coverage. 
I recently made some changes to try to standardize the "end" of the stack, so that the stack pointer is always a certain value before calling into C code. I also added some warnings to the unwinder to ensure that it always reaches that point on the stack. So if the "end" of the stack were adjusted by a word by adding padding to pt_regs, the unwinder warnings could help preserve that. We could take that a step further by adding an unwinder check to ensure that *every* frame is 16-byte aligned if -mpreferred-stack-boundary=3 isn't used. Yet another step would be to add a debug feature which does stack sanity checking from a periodic NMI, to flush out these unwinder warnings. (Though I've found that current 0-day and fuzzing efforts, combined with lockdep and perf's frequent unwinder usage, are already doing a great job at flushing out unwinder warnings.) The only question is if there would be enough test coverage, particularly with those versions of gcc which don't have -mpreferred-stack-boundary=3. -- Josh -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
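The invariant the unwinder would be checking is simple arithmetic. At every call site %rsp must be 16-byte aligned; the `call` pushes an 8-byte return address, so on function entry %rsp is 8 mod 16, and the callee's total frame allocation (pushes plus `sub`) must itself be 8 mod 16 to realign the stack before its own calls. A sketch of that check:

```c
/* Check whether a statically known frame size preserves 16-byte
 * stack alignment: entry %rsp is 8 mod 16 (because 'call' pushed an
 * 8-byte return address), so the allocation must add another
 * 8 mod 16 bytes to reach a 16-byte boundary. */
static int frame_keeps_alignment(unsigned long bytes_allocated)
{
        return (8 + bytes_allocated) % 16 == 0;
}
```

This is the kind of per-frame rule objtool (or an unwinder warning) could verify mechanically for functions whose frame size is statically known.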
Re: [PATCH v8 1/1] crypto: add virtio-crypto driver
On Thu, Jan 12, 2017 at 03:10:25PM +0100, Christian Borntraeger wrote: > On 01/10/2017 01:56 PM, Christian Borntraeger wrote: > > On 01/10/2017 01:36 PM, Gonglei (Arei) wrote: > >> Hi, > >> > >>> > >>> On 12/15/2016 03:03 AM, Gonglei wrote: > >>> [...] > + > +static struct crypto_alg virtio_crypto_algs[] = { { > +.cra_name = "cbc(aes)", > +.cra_driver_name = "virtio_crypto_aes_cbc", > +.cra_priority = 501, > >>> > >>> > >>> This is still higher than the hardware-accelerators (like intel aesni or > >>> the > >>> s390 cpacf functions or the arm hw). aesni and s390/cpacf are supported > >>> by the > >>> hardware virtualization and available to the guests. I do not see a way > >>> how > >>> virtio > >>> crypto can be faster than that (in the end it might be cpacf/aesni + > >>> overhead) > >>> instead it will very likely be slower. > >>> So we should use a number that is higher than software implementations but > >>> lower than the hw ones. > >>> > >>> Just grepping around, the software ones seem be be around 100 and the > >>> hardware > >>> ones around 200-400. So why was 150 not enough? > >>> > >> I didn't find a documentation about how we use the priority, and I assumed > >> people use virtio-crypto will configure hardware accelerators in the > >> host. So I choosed the number which bigger than aesni's priority. > > > > Yes, but the aesni driver will only bind if there is HW support in the > > guest. > > And if aesni is available in the guest (or the s390 aes function from cpacf) > > it will always be faster than the same in the host via virtio.So your > > priority > > should be smaller. > > > any opinion on this? Going forward, we might add an emulated aesni device and that might become slower than virtio. OTOH if or when this happens, we can solve it by adding a priority or a feature flag to virtio to raise its priority. So I think I agree with Christian here, let's lower the priority. Gonglei, could you send a patch like this? 
--
MST
Re: [PATCH v8 1/1] crypto: add virtio-crypto driver
On 01/10/2017 01:56 PM, Christian Borntraeger wrote: > On 01/10/2017 01:36 PM, Gonglei (Arei) wrote: >> Hi, >> >>> >>> On 12/15/2016 03:03 AM, Gonglei wrote: >>> [...] + +static struct crypto_alg virtio_crypto_algs[] = { { + .cra_name = "cbc(aes)", + .cra_driver_name = "virtio_crypto_aes_cbc", + .cra_priority = 501, >>> >>> >>> This is still higher than the hardware-accelerators (like intel aesni or the >>> s390 cpacf functions or the arm hw). aesni and s390/cpacf are supported by >>> the >>> hardware virtualization and available to the guests. I do not see a way how >>> virtio >>> crypto can be faster than that (in the end it might be cpacf/aesni + >>> overhead) >>> instead it will very likely be slower. >>> So we should use a number that is higher than software implementations but >>> lower than the hw ones. >>> >>> Just grepping around, the software ones seem be be around 100 and the >>> hardware >>> ones around 200-400. So why was 150 not enough? >>> >> I didn't find a documentation about how we use the priority, and I assumed >> people use virtio-crypto will configure hardware accelerators in the >> host. So I choosed the number which bigger than aesni's priority. > > Yes, but the aesni driver will only bind if there is HW support in the guest. > And if aesni is available in the guest (or the s390 aes function from cpacf) > it will always be faster than the same in the host via virtio.So your priority > should be smaller. any opinion on this? -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
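The selection behind this whole discussion: when several implementations register the same cra_name, the crypto API prefers the highest cra_priority. A toy model (the numbers are illustrative — software around 100, virtio lowered to 150, hardware around 200-400 — not values from any real driver table):

```c
#include <stddef.h>

struct alg {
        const char *driver;
        int priority;
};

/* Toy model of priority-based selection: among implementations of
 * the same algorithm name, the highest cra_priority wins. */
static const struct alg *pick(const struct alg *algs, size_t n)
{
        const struct alg *best = NULL;
        size_t i;

        for (i = 0; i < n; i++)
                if (!best || algs[i].priority > best->priority)
                        best = &algs[i];
        return best;
}
```

With virtio at 150, a guest that also sees aesni (~300) keeps using aesni, while a guest with only software implementations (~100) falls back to virtio — exactly the ordering Christian is asking for.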
Re: [PATCH v4 3/3] drivers: crypto: Enable CPT options crypto for build
Hi George, [auto build test WARNING on v4.9-rc8] [cannot apply to cryptodev/master crypto/master sparc-next/master next-20170111] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/George-Cherian/Add-Support-for-Cavium-Cryptographic-Acceleration-Unit/20170112-192240 config: parisc-allyesconfig (attached as .config) compiler: hppa-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=parisc All warnings (new ones prefixed by >>): drivers/crypto/cavium/cpt/cptvf_reqmanager.c: In function 'process_request': >> drivers/crypto/cavium/cpt/cptvf_reqmanager.c:470:49: warning: format '%lu' >> expects argument of type 'long unsigned int', but argument 3 has type >> 'unsigned int' [-Wformat=] dev_err(&pdev->dev, "mapping compptr Failed %lu\n", ^ vim +470 drivers/crypto/cavium/cpt/cptvf_reqmanager.c 8413476d George Cherian 2017-01-11 454 goto request_cleanup; 8413476d George Cherian 2017-01-11 455 } 8413476d George Cherian 2017-01-11 456 8413476d George Cherian 2017-01-11 457 cpt_req->dlen = info->dlen; 8413476d George Cherian 2017-01-11 458 /* 8413476d George Cherian 2017-01-11 459 * Get buffer for union cpt_res_s response 8413476d George Cherian 2017-01-11 460 * structure and its physical address 8413476d George Cherian 2017-01-11 461 */ 8413476d George Cherian 2017-01-11 462 info->completion_addr = kzalloc(sizeof(union cpt_res_s), 8413476d George Cherian 2017-01-11 463 GFP_KERNEL | GFP_ATOMIC); 8413476d George Cherian 2017-01-11 464 *((u8 *)(info->completion_addr)) = COMPLETION_CODE_INIT; 8413476d George Cherian 2017-01-11 465 info->comp_baddr = dma_map_single(&pdev->dev, 8413476d George Cherian 2017-01-11 466 (void *)info->completion_addr, 8413476d George Cherian 2017-01-11 467 
sizeof(union cpt_res_s),
8413476d George Cherian 2017-01-11  468				  DMA_BIDIRECTIONAL);
8413476d George Cherian 2017-01-11  469	if (dma_mapping_error(&pdev->dev, info->comp_baddr)) {
8413476d George Cherian 2017-01-11 @470		dev_err(&pdev->dev, "mapping compptr Failed %lu\n",
8413476d George Cherian 2017-01-11  471			sizeof(union cpt_res_s));
8413476d George Cherian 2017-01-11  472		ret = -EFAULT;
8413476d George Cherian 2017-01-11  473		goto request_cleanup;
8413476d George Cherian 2017-01-11  474	}
8413476d George Cherian 2017-01-11  475
8413476d George Cherian 2017-01-11  476	/* Fill the VQ command */
8413476d George Cherian 2017-01-11  477	vq_cmd.cmd.u64 = 0;
8413476d George Cherian 2017-01-11  478	vq_cmd.cmd.s.opcode = cpu_to_be16(cpt_req->opcode.flags);

:: The code at line 470 was first introduced by commit
:: 8413476deed83359518ea36cc316f4669a8c521c drivers: crypto: Add the Virtual Function driver for CPT

:: TO: George Cherian
:: CC: 0day robot

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation
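The robot's warning is the classic size_t format mismatch: `sizeof` has type `size_t`, whose portable printf length modifier is `z`, while `%lu` is only correct where `long` happens to match `size_t` (it does not on 32-bit parisc, hence the warning). A minimal userspace illustration of the fix — the union here is a stand-in with a known size, not the real cpt_res_s:

```c
#include <stdio.h>

/* Stand-in for a response structure with a known 32-byte size. */
union res_demo {
        unsigned long long u64;
        unsigned char raw[32];
};

/* Print a size_t with the 'z' length modifier instead of %lu, which
 * is what silences -Wformat on all architectures. */
static int format_msg(char *out, size_t outsz)
{
        return snprintf(out, outsz, "mapping compptr Failed %zu\n",
                        sizeof(union res_demo));
}
```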
Re: x86-64: Maintain 16-byte stack alignment
On Wed, Jan 11, 2017 at 10:21:07PM -0800, Andy Lutomirski wrote: > On Tue, Jan 10, 2017 at 10:01 PM, Andy Lutomirski wrote: > > On Tue, Jan 10, 2017 at 8:35 PM, Herbert Xu > > wrote: > >> On Tue, Jan 10, 2017 at 08:17:17PM -0800, Linus Torvalds wrote: > >>> > >>> That said, I do think that the "don't assume stack alignment, do it by > >>> hand" may be the safer thing. Because who knows what the random rules > >>> will be on other architectures. > >> > >> Sure we can ban the use of attribute aligned on stacks. But > >> what about indirect uses through structures? For example, if > >> someone does > >> > >> struct foo { > >> } __attribute__ ((__aligned__(16))); > >> > >> int bar(...) > >> { > >> struct foo f; > >> > >> return baz(&f); > >> } > >> > >> then baz will end up with an unaligned argument. The worst part > >> is that it is not at all obvious to the person writing the function > >> bar. > > > > Linus, I'm starting to lean toward agreeing with Herbert here, except > > that we should consider making it conditional on having a silly GCC > > version. After all, the silly GCC versions are wasting space and time > > with alignment instructions no matter what we do, so this would just > > mean tweaking the asm and adding some kind of check_stack_alignment() > > helper to throw out a WARN_ONCE() if we miss one. The problem with > > making it conditional is that making pt_regs effectively live at a > > variable offset from %rsp is just nasty. > > So actually doing this is gross because we have calls from asm to C > all over the place. But... maybe we can automate all the testing. > Josh, how hard would it be to teach objtool to (if requested by an > option) check that stack frames with statically known size preserve > 16-byte stack alignment? > > I find it rather annoying that gcc before 4.8 malfunctions when it > sees __aligned__(16) on x86_64 kernels. Sigh. 
Just to clarify, I think you're asking if, for versions of gcc which
don't support -mpreferred-stack-boundary=3, objtool can analyze all C
functions to ensure their stacks are 16-byte aligned.

It's certainly possible, but I don't see how that solves the problem.
The stack will still be misaligned by entry code.

Or am I missing something?

--
Josh
[PATCH] crypto: testmgr - use calculated count for number of test vectors
When working on AES in CCM mode for ARM, my code passed the internal tcrypt test before I had even bothered to implement the AES-192 and AES-256 code paths, which is strange because the tcrypt does contain AES-192 and AES-256 test vectors for CCM. As it turned out, the define AES_CCM_ENC_TEST_VECTORS was out of sync with the actual number of test vectors, causing only the AES-128 ones to be executed. So get rid of the defines, and wrap the test vector references in a macro that calculates the number of vectors automatically. The following test vector counts were out of sync with the respective defines: BF_CTR_ENC_TEST_VECTORS 2 -> 3 BF_CTR_DEC_TEST_VECTORS 2 -> 3 TF_CTR_ENC_TEST_VECTORS 2 -> 3 TF_CTR_DEC_TEST_VECTORS 2 -> 3 SERPENT_CTR_ENC_TEST_VECTORS 2 -> 3 SERPENT_CTR_DEC_TEST_VECTORS 2 -> 3 AES_CCM_ENC_TEST_VECTORS 8 -> 14 AES_CCM_DEC_TEST_VECTORS 7 -> 17 AES_CCM_4309_ENC_TEST_VECTORS7 -> 23 AES_CCM_4309_DEC_TEST_VECTORS 10 -> 23 CAMELLIA_CTR_ENC_TEST_VECTORS2 -> 3 CAMELLIA_CTR_DEC_TEST_VECTORS2 -> 3 Signed-off-by: Ard Biesheuvel --- crypto/testmgr.c | 1033 +++--- crypto/testmgr.h | 272 +- 2 files changed, 204 insertions(+), 1101 deletions(-) diff --git a/crypto/testmgr.c b/crypto/testmgr.c index 44e888b0b041..b49d57826f4a 100644 --- a/crypto/testmgr.c +++ b/crypto/testmgr.c @@ -2251,30 +2251,23 @@ static int alg_test_null(const struct alg_test_desc *desc, return 0; } +#define __VECS(tv) { .vecs = tv, .count = ARRAY_SIZE(tv) } + /* Please keep this list sorted by algorithm name. 
*/ static const struct alg_test_desc alg_test_descs[] = { { .alg = "ansi_cprng", .test = alg_test_cprng, .suite = { - .cprng = { - .vecs = ansi_cprng_aes_tv_template, - .count = ANSI_CPRNG_AES_TEST_VECTORS - } + .cprng = __VECS(ansi_cprng_aes_tv_template) } }, { .alg = "authenc(hmac(md5),ecb(cipher_null))", .test = alg_test_aead, .suite = { .aead = { - .enc = { - .vecs = hmac_md5_ecb_cipher_null_enc_tv_template, - .count = HMAC_MD5_ECB_CIPHER_NULL_ENC_TEST_VECTORS - }, - .dec = { - .vecs = hmac_md5_ecb_cipher_null_dec_tv_template, - .count = HMAC_MD5_ECB_CIPHER_NULL_DEC_TEST_VECTORS - } + .enc = __VECS(hmac_md5_ecb_cipher_null_enc_tv_template), + .dec = __VECS(hmac_md5_ecb_cipher_null_dec_tv_template) } } }, { @@ -2282,12 +2275,7 @@ static const struct alg_test_desc alg_test_descs[] = { .test = alg_test_aead, .suite = { .aead = { - .enc = { - .vecs = - hmac_sha1_aes_cbc_enc_tv_temp, - .count = - HMAC_SHA1_AES_CBC_ENC_TEST_VEC - } + .enc = __VECS(hmac_sha1_aes_cbc_enc_tv_temp) } } }, { @@ -2295,12 +2283,7 @@ static const struct alg_test_desc alg_test_descs[] = { .test = alg_test_aead, .suite = { .aead = { - .enc = { - .vecs = - hmac_sha1_des_cbc_enc_tv_temp, - .count = - HMAC_SHA1_DES_CBC_ENC_TEST_VEC - } + .enc = __VECS(hmac_sha1_des_cbc_enc_tv_temp) } } }, { @@ -2309,12 +2292,7 @@ static const struct alg_test_desc alg_test_descs[] = { .fips_allowed = 1, .suite = { .aead = { - .enc = { - .vecs = - hmac_sha1_des3_ede_cbc_enc_tv_temp, - .count = - HMAC_SHA1_DES3_EDE_CBC_ENC_TEST_VEC - } + .enc = __VECS(hmac_sha1_des3_ede_cbc_enc_tv_temp) } } },
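The `__VECS` trick above is a general C pattern: derive the element count from the array itself with `ARRAY_SIZE`, so the count can never drift out of sync with the vectors the way the hand-maintained defines did. A compact userspace illustration (the names and vector type are stand-ins):

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct vec_set {
        const int *vecs;
        size_t count;
};

/* Same shape as the testmgr macro: initialize both the pointer and
 * the count from a single array argument. */
#define __VECS(tv) { .vecs = tv, .count = ARRAY_SIZE(tv) }

static const int demo_tv[] = { 10, 20, 30 };   /* stand-in vectors */
static const struct vec_set demo_set = __VECS(demo_tv);
```

Adding or removing a test vector now updates `count` automatically at compile time, which is exactly the bug class the patch eliminates.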
Re: [PATCH v4 3/3] drivers: crypto: Enable CPT options crypto for build
Hi George, [auto build test WARNING on v4.9-rc8] [cannot apply to cryptodev/master crypto/master sparc-next/master next-20170111] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/George-Cherian/Add-Support-for-Cavium-Cryptographic-Acceleration-Unit/20170112-192240 coccinelle warnings: (new ones prefixed by >>) >> drivers/crypto/cavium/cpt/cptvf_reqmanager.c:312:2-8: WARNING: NULL check >> before freeing functions like kfree, debugfs_remove, >> debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider >> reorganizing relevant code to avoid passing NULL values. drivers/crypto/cavium/cpt/cptvf_reqmanager.c:315:2-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. drivers/crypto/cavium/cpt/cptvf_reqmanager.c:318:2-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. drivers/crypto/cavium/cpt/cptvf_reqmanager.c:321:2-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. Please review and possibly fold the followup patch. --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] drivers: crypto: fix ifnullfree.cocci warnings
drivers/crypto/cavium/cpt/cptvf_reqmanager.c:312:2-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. drivers/crypto/cavium/cpt/cptvf_reqmanager.c:315:2-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. drivers/crypto/cavium/cpt/cptvf_reqmanager.c:318:2-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. drivers/crypto/cavium/cpt/cptvf_reqmanager.c:321:2-8: WARNING: NULL check before freeing functions like kfree, debugfs_remove, debugfs_remove_recursive or usb_free_urb is not needed. Maybe consider reorganizing relevant code to avoid passing NULL values. NULL check before some freeing functions is not needed. Based on checkpatch warning "kfree(NULL) is safe this check is probably not required" and kfreeaddr.cocci by Julia Lawall. 
Generated by: scripts/coccinelle/free/ifnullfree.cocci CC: George Cherian Signed-off-by: Fengguang Wu --- cptvf_reqmanager.c | 12 1 file changed, 4 insertions(+), 8 deletions(-) --- a/drivers/crypto/cavium/cpt/cptvf_reqmanager.c +++ b/drivers/crypto/cavium/cpt/cptvf_reqmanager.c @@ -308,17 +308,13 @@ void do_request_cleanup(struct cpt_vf *c } } - if (info->scatter_components) - kzfree(info->scatter_components); + kzfree(info->scatter_components); - if (info->gather_components) - kzfree(info->gather_components); + kzfree(info->gather_components); - if (info->out_buffer) - kzfree(info->out_buffer); + kzfree(info->out_buffer); - if (info->in_buffer) - kzfree(info->in_buffer); + kzfree(info->in_buffer); if (info->completion_addr) kzfree((void *)info->completion_addr); -- To unsubscribe from this list: send the line "unsubscribe linux-crypto" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
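The userspace analogue of the coccinelle finding: `free(NULL)` is defined by the C standard to be a no-op, just as `kfree(NULL)` and `kzfree(NULL)` are in the kernel, so the `if (ptr)` guard before each call adds nothing. A small sketch:

```c
#include <stdlib.h>

/* Free a pointer without a redundant NULL check, then clear it so a
 * second call is harmless (no double free). */
static void release(void **p)
{
        free(*p);      /* safe even when *p is NULL */
        *p = NULL;
}
```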
[RFC PATCH 6/6] dm-crypt: Add bulk crypto processing support
This patch converts dm-crypt to use bulk requests when invoking skcipher
operations, allowing the crypto drivers to process multiple sectors at once,
while reducing the overhead caused by the small sector size.

The new code detects if multiple sectors from a bio are contiguously stored
within a single page (which should almost always be the case), and in such a
case processes all these sectors via a single bulk request.

Note that the bio can also consist of several (likely consecutive) pages,
which could all be bundled in a single request. However, since we need to
specify an upper bound on how many sectors we are going to send at once (and
this bound may affect the amount of memory allocated per single request), it
is best to just limit the request bundling to a single page.

Note that if the 'keycount' parameter of the cipher specification is set to a
value other than 1, dm-crypt still sends only one sector in each request,
since in such a case the neighboring sectors are encrypted with different
keys.

This change causes a detectable read/write speedup (about 5-10%) on a ramdisk
when AES-NI accelerated ciphers are used.
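The single-page bound works out to a small constant: with the usual 4 KiB pages and 512-byte sectors, at most eight consecutive sectors fit in one page. The shifts are hard-coded below for illustration; the kernel takes PAGE_SHIFT and SECTOR_SHIFT from its own headers.

```c
/* Upper bound on sectors bundled into one bulk request when bundling
 * is limited to a single page: page_size / sector_size. */
#define DEMO_PAGE_SHIFT   12   /* 4096-byte page */
#define DEMO_SECTOR_SHIFT  9   /* 512-byte sector */
#define DEMO_MAX_CONSEC_SECTORS (1 << (DEMO_PAGE_SHIFT - DEMO_SECTOR_SHIFT))
```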
Signed-off-by: Ondrej Mosnacek --- drivers/md/dm-crypt.c | 254 -- 1 file changed, 165 insertions(+), 89 deletions(-) diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 7c6c572..d3f69e1 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -37,6 +37,9 @@ #define DM_MSG_PREFIX "crypt" +/* for now, we only bundle consecutve sectors within a single page */ +#define MAX_CONSEC_SECTORS (1 << (PAGE_SHIFT - SECTOR_SHIFT)) + /* * context holding the current state of a multi-part conversion */ @@ -48,7 +51,7 @@ struct convert_context { struct bvec_iter iter_out; sector_t cc_sector; atomic_t cc_pending; - struct skcipher_request *req; + struct skcipher_bulk_request *req; }; /* @@ -73,6 +76,7 @@ struct dm_crypt_request { struct scatterlist sg_in; struct scatterlist sg_out; sector_t iv_sector; + sector_t sector_count; }; struct crypt_config; @@ -83,9 +87,9 @@ struct crypt_iv_operations { void (*dtr)(struct crypt_config *cc); int (*init)(struct crypt_config *cc); int (*wipe)(struct crypt_config *cc); - int (*generator)(struct crypt_config *cc, u8 *iv, + int (*generator)(struct crypt_config *cc, u8 *iv, unsigned int sector, struct dm_crypt_request *dmreq); - int (*post)(struct crypt_config *cc, u8 *iv, + int (*post)(struct crypt_config *cc, u8 *iv, unsigned int sector, struct dm_crypt_request *dmreq); }; @@ -163,14 +167,14 @@ struct crypt_config { /* * Layout of each crypto request: * -* struct skcipher_request +* struct skcipher_bulk_request * context * padding * struct dm_crypt_request * padding -* IV +* IVs * -* The padding is added so that dm_crypt_request and the IV are +* The padding is added so that dm_crypt_request and the IVs are * correctly aligned. 
*/ unsigned int dmreq_start; @@ -245,20 +249,24 @@ static struct crypto_skcipher *any_tfm(struct crypt_config *cc) * http://article.gmane.org/gmane.linux.kernel.device-mapper.dm-crypt/454 */ -static int crypt_iv_plain_gen(struct crypt_config *cc, u8 *iv, +static int crypt_iv_plain_gen(struct crypt_config *cc, u8 *ivs, unsigned int i, struct dm_crypt_request *dmreq) { + u8 *iv = ivs + i * cc->iv_size; + memset(iv, 0, cc->iv_size); - *(__le32 *)iv = cpu_to_le32(dmreq->iv_sector & 0x); + *(__le32 *)iv = cpu_to_le32((dmreq->iv_sector + i) & 0x); return 0; } -static int crypt_iv_plain64_gen(struct crypt_config *cc, u8 *iv, - struct dm_crypt_request *dmreq) +static int crypt_iv_plain64_gen(struct crypt_config *cc, u8 *ivs, + unsigned int i, struct dm_crypt_request *dmreq) { + u8 *iv = ivs + i * cc->iv_size; + memset(iv, 0, cc->iv_size); - *(__le64 *)iv = cpu_to_le64(dmreq->iv_sector); + *(__le64 *)iv = cpu_to_le64(dmreq->iv_sector + i); return 0; } @@ -410,13 +418,14 @@ static int crypt_iv_essiv_ctr(struct crypt_config *cc, struct dm_target *ti, return err; } -static int crypt_iv_essiv_gen(struct crypt_config *cc, u8 *iv, +static int crypt_iv_essiv_gen(struct crypt_config *cc, u8 *ivs, unsigned int i, struct dm_crypt_request *dmreq) { struct crypto_cipher *essiv_tfm = cc->iv_private; + u8 *iv = ivs + i * cc->iv_size; memset(iv, 0, cc->iv_size); - *(__le64 *)iv = cpu_to_le64(dmreq->iv_sector); + *(__le64 *)iv = cpu_to_le64(dmreq->iv_sector + i); crypto_cipher_encr
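The IV generators in the diff follow one scheme: the IV is the 64-bit sector number stored little-endian in the leading bytes, rest zeroed, and the bulk variant adds the message index `i` to the base sector so each bundled sector gets its own IV. A self-contained sketch of the plain64 case (byte-wise stores instead of the kernel's cpu_to_le64, same result):

```c
#include <stdint.h>
#include <string.h>

/* plain64-style IV: zero the IV, then store (base_sector + i) as a
 * little-endian 64-bit value in the first eight bytes. */
static void plain64_iv(uint8_t *iv, size_t ivsize,
                       uint64_t base_sector, unsigned int i)
{
        uint64_t s = base_sector + i;
        size_t b;

        memset(iv, 0, ivsize);
        for (b = 0; b < 8 && b < ivsize; b++)
                iv[b] = (uint8_t)(s >> (8 * b));
}
```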
[RFC PATCH 1/6] crypto: skcipher - Add bulk request processing API
This patch adds bulk request processing to the skcipher interface. Specifically, it adds a new type of request ('skcipher_bulk_request'), which allows passing multiple independent messages to the skcipher driver. The buffers for the message data are passed via just two sg lists (one for src buffer, one for dst buffer). The IVs are passed via a single buffer, where they are stored sequentially. The interface allows specifying either a fixed length for all messages or a pointer to an array of message lengths. A skcipher implementation that wants to provide support for bulk requests may set the appropriate fields of its skcipher_alg struct. If these fields are not provided (or the skcipher is created from an (a)blkcipher), the crypto API automatically sets these fields to a fallback implementation, which just splits the bulk request into a series of regular skcipher requests on the same tfm. This means that the new type of request can be used with all skciphers, even if they do not support bulk requests natively. Note that when allocating a skcipher_bulk_request, the user must specify the maximum number of messages that they are going to submit via the request. This is necessary for the fallback implementation, which has to allocate space for the appropriate number of subrequests so that they can be processed in parallel. If the skcipher is synchronous, then the fallback implementation only allocates space for one subrequest and processes the partial requests sequentially. 
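The fixed-size-or-array message-length convention described above can be sketched in a few lines. This is a hedged userspace illustration of how the fallback might resolve each message's length; `bulk_msg_len` and `bulk_total_len` are illustrative names, not part of the proposed API:

```c
#include <assert.h>

/* Either all messages share one fixed size (msgsizes == NULL), or each
 * message has its own entry in msgsizes[]. */
static unsigned int bulk_msg_len(unsigned int fixed,
                                 const unsigned int *msgsizes,
                                 unsigned int i)
{
    return msgsizes ? msgsizes[i] : fixed;
}

/* Total payload of the bulk request, as the fallback would accumulate it
 * while issuing one regular skcipher request per message. */
static unsigned int bulk_total_len(unsigned int nmsgs, unsigned int fixed,
                                   const unsigned int *msgsizes)
{
    unsigned int i, total = 0;

    for (i = 0; i < nmsgs; i++)
        total += bulk_msg_len(fixed, msgsizes, i);
    return total;
}
```

The same two-case lookup is what lets dm-crypt submit equally sized sectors without allocating a length array at all.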
Signed-off-by: Ondrej Mosnacek --- crypto/Makefile| 1 + crypto/skcipher.c | 15 ++ crypto/skcipher_bulk.c | 312 + include/crypto/internal/skcipher.h | 32 include/crypto/skcipher.h | 299 ++- 5 files changed, 658 insertions(+), 1 deletion(-) create mode 100644 crypto/skcipher_bulk.c diff --git a/crypto/Makefile b/crypto/Makefile index b8f0e3e..cd1cf57 100644 --- a/crypto/Makefile +++ b/crypto/Makefile @@ -19,6 +19,7 @@ obj-$(CONFIG_CRYPTO_AEAD2) += aead.o crypto_blkcipher-y := ablkcipher.o crypto_blkcipher-y += blkcipher.o crypto_blkcipher-y += skcipher.o +crypto_blkcipher-y += skcipher_bulk.o obj-$(CONFIG_CRYPTO_BLKCIPHER2) += crypto_blkcipher.o obj-$(CONFIG_CRYPTO_SEQIV) += seqiv.o obj-$(CONFIG_CRYPTO_ECHAINIV) += echainiv.o diff --git a/crypto/skcipher.c b/crypto/skcipher.c index 6ee6a15..8b6d684 100644 --- a/crypto/skcipher.c +++ b/crypto/skcipher.c @@ -667,6 +667,8 @@ static int crypto_init_skcipher_ops_blkcipher(struct crypto_tfm *tfm) skcipher->ivsize = crypto_blkcipher_ivsize(blkcipher); skcipher->keysize = calg->cra_blkcipher.max_keysize; + crypto_skcipher_bulk_set_fallback(skcipher); + return 0; } @@ -760,6 +762,8 @@ static int crypto_init_skcipher_ops_ablkcipher(struct crypto_tfm *tfm) sizeof(struct ablkcipher_request); skcipher->keysize = calg->cra_ablkcipher.max_keysize; + crypto_skcipher_bulk_set_fallback(skcipher); + return 0; } @@ -789,6 +793,14 @@ static int crypto_skcipher_init_tfm(struct crypto_tfm *tfm) skcipher->ivsize = alg->ivsize; skcipher->keysize = alg->max_keysize; + if (!alg->encrypt_bulk || !alg->decrypt_bulk || !alg->reqsize_bulk) + crypto_skcipher_bulk_set_fallback(skcipher); + else { + skcipher->encrypt_bulk = alg->encrypt_bulk; + skcipher->decrypt_bulk = alg->decrypt_bulk; + skcipher->reqsize_bulk = alg->reqsize_bulk; + } + if (alg->exit) skcipher->base.exit = crypto_skcipher_exit_tfm; @@ -822,6 +834,9 @@ static void crypto_skcipher_show(struct seq_file *m, struct crypto_alg *alg) seq_printf(m, "ivsize : %u\n", skcipher->ivsize); 
seq_printf(m, "chunksize: %u\n", skcipher->chunksize); seq_printf(m, "walksize : %u\n", skcipher->walksize); + seq_printf(m, "bulk : %s\n", + (skcipher->encrypt_bulk && skcipher->decrypt_bulk && + skcipher->reqsize_bulk) ? "yes" : "no"); } #ifdef CONFIG_NET diff --git a/crypto/skcipher_bulk.c b/crypto/skcipher_bulk.c new file mode 100644 index 000..9630122 --- /dev/null +++ b/crypto/skcipher_bulk.c @@ -0,0 +1,312 @@ +/* + * Bulk IV fallback for skcipher. + * + * Copyright (C) 2016-2017 Red Hat, Inc. All rights reserved. + * Copyright (c) 2016-2017 Ondrej Mosnacek + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct skcipher_bulk_subreqctx { + struct scatterlist sg_src[2]; + struct scatterlist sg_dst[2];
[RFC PATCH 4/6] crypto: simd - Add bulk request support
This patch adds proper support for the new bulk requests to the SIMD helpers. Signed-off-by: Ondrej Mosnacek --- crypto/simd.c | 61 +++ 1 file changed, 61 insertions(+) diff --git a/crypto/simd.c b/crypto/simd.c index 8820337..2ae5930 100644 --- a/crypto/simd.c +++ b/crypto/simd.c @@ -100,6 +100,64 @@ static int simd_skcipher_decrypt(struct skcipher_request *req) return crypto_skcipher_decrypt(subreq); } +static int simd_skcipher_encrypt_bulk(struct skcipher_bulk_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_bulk_reqtfm(req); + struct simd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_bulk_request *subreq; + struct crypto_skcipher *child; + + subreq = skcipher_bulk_request_ctx(req); + *subreq = *req; + + if (!may_use_simd() || + (in_atomic() && cryptd_skcipher_queued(ctx->cryptd_tfm))) + child = &ctx->cryptd_tfm->base; + else + child = cryptd_skcipher_child(ctx->cryptd_tfm); + + skcipher_bulk_request_set_tfm(subreq, child); + + return crypto_skcipher_encrypt_bulk(subreq); +} + +static int simd_skcipher_decrypt_bulk(struct skcipher_bulk_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_bulk_reqtfm(req); + struct simd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_bulk_request *subreq; + struct crypto_skcipher *child; + + subreq = skcipher_bulk_request_ctx(req); + *subreq = *req; + + if (!may_use_simd() || + (in_atomic() && cryptd_skcipher_queued(ctx->cryptd_tfm))) + child = &ctx->cryptd_tfm->base; + else + child = cryptd_skcipher_child(ctx->cryptd_tfm); + + skcipher_bulk_request_set_tfm(subreq, child); + + return crypto_skcipher_decrypt_bulk(subreq); +} + +static unsigned int simd_skcipher_reqsize_bulk(struct crypto_skcipher *tfm, + unsigned int maxmsgs) +{ + struct simd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm); + struct crypto_skcipher *tfm_cryptd, *tfm_child; + unsigned int reqsize_cryptd, reqsize_child; + + tfm_cryptd = &ctx->cryptd_tfm->base; + tfm_child = 
cryptd_skcipher_child(ctx->cryptd_tfm); + + reqsize_cryptd = crypto_skcipher_bulk_reqsize(tfm_cryptd, maxmsgs); + reqsize_child = crypto_skcipher_bulk_reqsize(tfm_child, maxmsgs); + return sizeof(struct skcipher_bulk_request) + + max(reqsize_cryptd, reqsize_child); +} + static void simd_skcipher_exit(struct crypto_skcipher *tfm) { struct simd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm); @@ -187,6 +245,9 @@ struct simd_skcipher_alg *simd_skcipher_create_compat(const char *algname, alg->setkey = simd_skcipher_setkey; alg->encrypt = simd_skcipher_encrypt; alg->decrypt = simd_skcipher_decrypt; + alg->encrypt_bulk = simd_skcipher_encrypt_bulk; + alg->decrypt_bulk = simd_skcipher_decrypt_bulk; + alg->reqsize_bulk = simd_skcipher_reqsize_bulk; err = crypto_register_skcipher(alg); if (err) -- 2.9.3
[RFC PATCH 3/6] crypto: cryptd - Add skcipher bulk request support
This patch adds proper support for the new bulk requests to cryptd. Signed-off-by: Ondrej Mosnacek --- crypto/cryptd.c | 111 1 file changed, 111 insertions(+) diff --git a/crypto/cryptd.c b/crypto/cryptd.c index 0508c48..b7d6e13 100644 --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -555,6 +555,114 @@ static int cryptd_skcipher_decrypt_enqueue(struct skcipher_request *req) return cryptd_skcipher_enqueue(req, cryptd_skcipher_decrypt); } +static void cryptd_skcipher_bulk_complete(struct skcipher_bulk_request *req, + int err) +{ + struct crypto_skcipher *tfm = crypto_skcipher_bulk_reqtfm(req); + struct cryptd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm); + struct cryptd_skcipher_request_ctx *rctx = + skcipher_bulk_request_ctx(req); + int refcnt = atomic_read(&ctx->refcnt); + + local_bh_disable(); + rctx->complete(&req->base, err); + local_bh_enable(); + + if (err != -EINPROGRESS && refcnt && atomic_dec_and_test(&ctx->refcnt)) + crypto_free_skcipher(tfm); +} + +static void cryptd_skcipher_bulk_encrypt(struct crypto_async_request *base, +int err) +{ + struct skcipher_bulk_request *req = skcipher_bulk_request_cast(base); + struct cryptd_skcipher_request_ctx *rctx = + skcipher_bulk_request_ctx(req); + struct crypto_skcipher *tfm = crypto_skcipher_bulk_reqtfm(req); + struct cryptd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm); + struct crypto_skcipher *child = ctx->child; + SKCIPHER_BULK_REQUEST_ON_STACK(subreq, req->maxmsgs, child); + + if (unlikely(err == -EINPROGRESS)) + goto out; + + skcipher_bulk_request_set_tfm(subreq, child); + skcipher_bulk_request_set_callback(subreq, CRYPTO_TFM_REQ_MAY_SLEEP, + NULL, NULL); + skcipher_bulk_request_set_crypt(subreq, req->src, req->dst, req->nmsgs, + req->msgsize, req->msgsizes, req->ivs); + + err = crypto_skcipher_encrypt_bulk(subreq); + skcipher_bulk_request_zero(subreq); + + req->base.complete = rctx->complete; + +out: + cryptd_skcipher_bulk_complete(req, err); +} + +static void cryptd_skcipher_bulk_decrypt(struct 
crypto_async_request *base, +int err) +{ + struct skcipher_bulk_request *req = skcipher_bulk_request_cast(base); + struct cryptd_skcipher_request_ctx *rctx = + skcipher_bulk_request_ctx(req); + struct crypto_skcipher *tfm = crypto_skcipher_bulk_reqtfm(req); + struct cryptd_skcipher_ctx *ctx = crypto_skcipher_ctx(tfm); + struct crypto_skcipher *child = ctx->child; + SKCIPHER_BULK_REQUEST_ON_STACK(subreq, req->maxmsgs, child); + + if (unlikely(err == -EINPROGRESS)) + goto out; + + skcipher_bulk_request_set_tfm(subreq, child); + skcipher_bulk_request_set_callback(subreq, CRYPTO_TFM_REQ_MAY_SLEEP, + NULL, NULL); + skcipher_bulk_request_set_crypt(subreq, req->src, req->dst, req->nmsgs, + req->msgsize, req->msgsizes, req->ivs); + + err = crypto_skcipher_decrypt_bulk(subreq); + skcipher_bulk_request_zero(subreq); + + req->base.complete = rctx->complete; + +out: + cryptd_skcipher_bulk_complete(req, err); +} + +static int cryptd_skcipher_bulk_enqueue(struct skcipher_bulk_request *req, + crypto_completion_t compl) +{ + struct cryptd_skcipher_request_ctx *rctx = + skcipher_bulk_request_ctx(req); + struct crypto_skcipher *tfm = crypto_skcipher_bulk_reqtfm(req); + struct cryptd_queue *queue; + + queue = cryptd_get_queue(crypto_skcipher_tfm(tfm)); + rctx->complete = req->base.complete; + req->base.complete = compl; + + return cryptd_enqueue_request(queue, &req->base); +} + +static int cryptd_skcipher_bulk_encrypt_enqueue( + struct skcipher_bulk_request *req) +{ + return cryptd_skcipher_bulk_enqueue(req, cryptd_skcipher_bulk_encrypt); +} + +static int cryptd_skcipher_bulk_decrypt_enqueue( + struct skcipher_bulk_request *req) +{ + return cryptd_skcipher_bulk_enqueue(req, cryptd_skcipher_bulk_decrypt); +} + +static unsigned int cryptd_skcipher_bulk_reqsize(struct crypto_skcipher *tfm, +unsigned int maxmsgs) +{ + return sizeof(struct cryptd_skcipher_request_ctx); +} + static int cryptd_skcipher_init_tfm(struct crypto_skcipher *tfm) { struct skcipher_instance *inst = 
skcipher_alg_instance(tfm); @@ -641,6 +749,9 @@ static int cryptd_create_skcipher(struct crypto_template *tmpl, inst->alg.setkey = cryptd_skci
[RFC PATCH 0/6] Add bulk skcipher requests to crypto API and dm-crypt
Hi, the goal of this patchset is to allow those skcipher API users that need to process batches of small messages (especially dm-crypt) to do so efficiently. The first patch introduces a new request type (and corresponding encrypt/decrypt functions) to the skcipher API. The new API can be used to submit multiple messages at once, thus enabling the drivers to reduce overhead as opposed to processing each message separately. The skcipher drivers can provide support for the new request type by setting the corresponding fields of their skcipher_alg structure. If 'native' support is not provided by a driver (i.e. the fields are left NULL), the crypto API transparently provides a generic fallback implementation, which simply processes the bulk request as a set of standard requests on the same tfm. The second patch extends skcipher_walk so it can be used for processing the new bulk requests, while preserving equivalent functionality when used with standard requests. The third and fourth patches add native bulk request support to the cryptd and SIMD helper wrappers, respectively. The fifth patch adds bulk request support to the AES-NI skcipher drivers, in order to provide an example both of implementing the bulk request processing and of using the extended skcipher_walk in such an implementation. Also, this patch provides a slight optimization, since the kernel_fpu_* functions are called just once for the whole bulk request. Note that both the standard and bulk implementations mostly use the same code under the hood. The last patch converts dm-crypt to use bulk requests and makes it submit multiple sectors at once, whenever they are stored sequentially within a single page. With all the patches applied, I was able to measure a small speedup (~5-10%) with AES-NI ciphers and dm-crypt device mapped over a ramdisk. 
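The bundling criterion mentioned for the dm-crypt conversion (only consecutive sectors stored within a single page) reduces to simple shift arithmetic. A small runnable sketch, where `bundle_sectors()` is an illustrative helper and not code taken verbatim from the patch:

```c
#include <assert.h>

#define PAGE_SHIFT   12  /* 4 KiB pages, as on x86 */
#define SECTOR_SHIFT 9   /* 512-byte sectors */
/* Mirrors the patch: at most one page's worth of sectors per bundle. */
#define MAX_CONSEC_SECTORS (1u << (PAGE_SHIFT - SECTOR_SHIFT))

/* How many of `remaining` consecutive sectors can go into one bulk
 * request when the data starts at byte `offset` inside its page. */
static unsigned int bundle_sectors(unsigned int offset, unsigned int remaining)
{
    unsigned int left_in_page = MAX_CONSEC_SECTORS - (offset >> SECTOR_SHIFT);

    return remaining < left_in_page ? remaining : left_in_page;
}
```

With 4 KiB pages this caps a bundle at eight 512-byte sectors, which matches the MAX_CONSEC_SECTORS definition added to dm-crypt.c.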
To-be-done: testing the bulk API in testmgr.c; documentation update Ondrej Mosnacek (6): crypto: skcipher - Add bulk request processing API crypto: skcipher - Add bulk request support to walk crypto: cryptd - Add skcipher bulk request support crypto: simd - Add bulk request support crypto: aesni-intel - Add bulk request support dm-crypt: Add bulk crypto processing support arch/x86/crypto/aesni-intel_glue.c| 267 +++-- arch/x86/crypto/glue_helper.c | 23 +-- arch/x86/include/asm/crypto/glue_helper.h | 2 +- crypto/Makefile | 1 + crypto/cryptd.c | 111 +++ crypto/simd.c | 61 ++ crypto/skcipher.c | 207 +++- crypto/skcipher_bulk.c| 312 ++ drivers/md/dm-crypt.c | 254 +++- include/crypto/internal/skcipher.h| 42 +++- include/crypto/skcipher.h | 299 +++- 11 files changed, 1369 insertions(+), 210 deletions(-) create mode 100644 crypto/skcipher_bulk.c -- 2.9.3
[RFC PATCH 5/6] crypto: aesni-intel - Add bulk request support
This patch implements bulk request handling in the AES-NI crypto drivers. The major advantage of this is that with bulk requests, the kernel_fpu_* functions (which are usually quite slow) are now called only once for the whole request. Signed-off-by: Ondrej Mosnacek --- arch/x86/crypto/aesni-intel_glue.c| 267 +++--- arch/x86/crypto/glue_helper.c | 23 ++- arch/x86/include/asm/crypto/glue_helper.h | 2 +- 3 files changed, 221 insertions(+), 71 deletions(-) diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c index 36ca150..5f67afc 100644 --- a/arch/x86/crypto/aesni-intel_glue.c +++ b/arch/x86/crypto/aesni-intel_glue.c @@ -364,70 +364,116 @@ static int aesni_skcipher_setkey(struct crypto_skcipher *tfm, const u8 *key, crypto_skcipher_ctx(tfm), key, len); } -static int ecb_encrypt(struct skcipher_request *req) +typedef void (*aesni_crypt_t)(struct crypto_aes_ctx *ctx, + u8 *out, const u8 *in, unsigned int len); + +typedef void (*aesni_ivcrypt_t)(struct crypto_aes_ctx *ctx, + u8 *out, const u8 *in, unsigned int len, + u8 *iv); + +static int ecb_crypt(struct crypto_aes_ctx *ctx, struct skcipher_walk *walk, +aesni_crypt_t crypt) { - struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); - struct skcipher_walk walk; unsigned int nbytes; int err; - err = skcipher_walk_virt(&walk, req, true); - kernel_fpu_begin(); - while ((nbytes = walk.nbytes)) { - aesni_ecb_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK); + while ((nbytes = walk->nbytes)) { + crypt(ctx, walk->dst.virt.addr, walk->src.virt.addr, + nbytes & AES_BLOCK_MASK); nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); + err = skcipher_walk_done(walk, nbytes); } kernel_fpu_end(); return err; } -static int ecb_decrypt(struct skcipher_request *req) +static int cbc_crypt(struct crypto_aes_ctx *ctx, struct skcipher_walk *walk, +aesni_ivcrypt_t crypt) { - struct 
crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); - struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); - struct skcipher_walk walk; unsigned int nbytes; int err; - err = skcipher_walk_virt(&walk, req, true); - kernel_fpu_begin(); - while ((nbytes = walk.nbytes)) { - aesni_ecb_dec(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK); + while ((nbytes = walk->nbytes)) { + crypt(ctx, walk->dst.virt.addr, walk->src.virt.addr, + nbytes & AES_BLOCK_MASK, walk->iv); nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); + err = skcipher_walk_done(walk, nbytes); } kernel_fpu_end(); return err; } -static int cbc_encrypt(struct skcipher_request *req) +static int ecb_encrypt(struct skcipher_request *req) { struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); struct skcipher_walk walk; - unsigned int nbytes; int err; err = skcipher_walk_virt(&walk, req, true); + if (err) + return err; - kernel_fpu_begin(); - while ((nbytes = walk.nbytes)) { - aesni_cbc_enc(ctx, walk.dst.virt.addr, walk.src.virt.addr, - nbytes & AES_BLOCK_MASK, walk.iv); - nbytes &= AES_BLOCK_SIZE - 1; - err = skcipher_walk_done(&walk, nbytes); - } - kernel_fpu_end(); + return ecb_crypt(ctx, &walk, aesni_ecb_enc); +} - return err; +static int ecb_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); + struct skcipher_walk walk; + int err; + + err = skcipher_walk_virt(&walk, req, true); + if (err) + return err; + + return ecb_crypt(ctx, &walk, aesni_ecb_dec); +} + +static int ecb_encrypt_bulk(struct skcipher_bulk_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_bulk_reqtfm(req); + struct crypto_aes_ctx *ctx = aes_ctx(crypto_skcipher_ctx(tfm)); + struct skcipher_walk walk; + int err; + + err = skcipher_walk_virt_bulk(&walk, req, true); + if (err) + return err; + + 
return ecb_crypt(ctx, &walk, aesni_ecb_enc); +} + +static int ecb_decrypt_bulk(s
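The refactoring pattern visible in the diff above (one shared loop parameterized by a function pointer, so the regular and bulk entry points reuse the same body and the expensive kernel_fpu_begin()/kernel_fpu_end() bracket runs exactly once per request) can be shown with a runnable toy version. The XOR "cipher" and all names here are placeholders, not the AES-NI code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for aesni_crypt_t: process one contiguous chunk. */
typedef void (*crypt_fn)(uint8_t *dst, const uint8_t *src, size_t len);

/* Trivial placeholder transform so the sketch is testable. */
static void xor_with_0xaa(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        dst[i] = src[i] ^ 0xaa;
}

/* Shared loop, analogous to the patch's ecb_crypt(): expensive setup
 * (kernel_fpu_begin() in the real driver) would bracket this whole loop
 * once, no matter how many messages the request carries. */
static int generic_crypt(uint8_t *dst, const uint8_t *src,
                         const size_t *chunks, size_t nchunks, crypt_fn fn)
{
    size_t i, off = 0;

    /* setup_once();  -- kernel_fpu_begin() in the real driver */
    for (i = 0; i < nchunks; i++) {
        fn(dst + off, src + off, chunks[i]);
        off += chunks[i];
    }
    /* teardown_once();  -- kernel_fpu_end() */
    return 0;
}
```

Amortizing the setup over all chunks is exactly the source of the speedup the patch description claims for the bulk path.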
[RFC PATCH 2/6] crypto: skcipher - Add bulk request support to walk
This patch tweaks skcipher_walk so it can be used with the new bulk requests. The new skcipher_walk can be initialized either from a skcipher_request (in which case its behavior is equivalent to the old code) or from a skcipher_bulk_request, in which case the usage is almost identical, the most significant exception being that skciphers which somehow tweak the IV (e.g. XTS) must check the new nextmsg flag before processing each chunk and re-tweak the IV if it is set. For other ciphers skcipher_walk automatically switches to the next IV at message boundaries. Signed-off-by: Ondrej Mosnacek --- crypto/skcipher.c | 192 +++-- include/crypto/internal/skcipher.h | 10 +- 2 files changed, 153 insertions(+), 49 deletions(-) diff --git a/crypto/skcipher.c b/crypto/skcipher.c index 8b6d684..b810e90 100644 --- a/crypto/skcipher.c +++ b/crypto/skcipher.c @@ -33,6 +33,7 @@ enum { SKCIPHER_WALK_COPY = 1 << 2, SKCIPHER_WALK_DIFF = 1 << 3, SKCIPHER_WALK_SLEEP = 1 << 4, + SKCIPHER_WALK_HETEROGENOUS = 1 << 5, }; struct skcipher_walk_buffer { @@ -94,6 +95,41 @@ static inline u8 *skcipher_get_spot(u8 *start, unsigned int len) return max(start, end_page); } +static int skcipher_copy_iv(struct skcipher_walk *walk) +{ + unsigned a = crypto_tfm_ctx_alignment() - 1; + unsigned alignmask = walk->alignmask; + unsigned ivsize = walk->ivsize; + unsigned bs = walk->stride; + unsigned aligned_bs; + unsigned size; + u8 *iv; + + aligned_bs = ALIGN(bs, alignmask); + + /* Minimum size to align buffer by alignmask. */ + size = alignmask & ~a; + + if (walk->flags & SKCIPHER_WALK_PHYS) + size += ivsize; + else { + size += aligned_bs + ivsize; + + /* Minimum size to ensure buffer does not straddle a page. 
*/ + size += (bs - 1) & ~(alignmask | a); + } + + walk->buffer = kmalloc(size, skcipher_walk_gfp(walk)); + if (!walk->buffer) + return -ENOMEM; + + iv = PTR_ALIGN(walk->buffer, alignmask + 1); + iv = skcipher_get_spot(iv, bs) + aligned_bs; + + walk->iv = memcpy(iv, walk->iv, walk->ivsize); + return 0; +} + static int skcipher_done_slow(struct skcipher_walk *walk, unsigned int bsize) { u8 *addr; @@ -108,9 +144,12 @@ static int skcipher_done_slow(struct skcipher_walk *walk, unsigned int bsize) int skcipher_walk_done(struct skcipher_walk *walk, int err) { unsigned int n = walk->nbytes - err; - unsigned int nbytes; + unsigned int nbytes, nbytes_msg; + + walk->nextmsg = false; /* reset the nextmsg flag */ nbytes = walk->total - n; + nbytes_msg = walk->total_msg - n; if (unlikely(err < 0)) { nbytes = 0; @@ -139,8 +178,31 @@ int skcipher_walk_done(struct skcipher_walk *walk, int err) if (err > 0) err = 0; + if (nbytes && !nbytes_msg) { + walk->nextmsg = true; + + /* write the output IV: */ + if (walk->iv != walk->oiv) + memcpy(walk->oiv, walk->iv, walk->ivsize); + + /* advance to the IV of next message: */ + walk->oiv += walk->ivsize; + walk->iv = walk->oiv; + + if (unlikely(((unsigned long)walk->iv & walk->alignmask))) { + err = skcipher_copy_iv(walk); + if (err) + return err; + } + + nbytes_msg = *walk->nextmsgsize; + if (walk->flags & SKCIPHER_WALK_HETEROGENOUS) + ++walk->nextmsgsize; + } + + walk->nbytes = nbytes_msg; + walk->total_msg = nbytes_msg; walk->total = nbytes; - walk->nbytes = nbytes; scatterwalk_advance(&walk->in, n); scatterwalk_advance(&walk->out, n); @@ -343,13 +405,13 @@ static int skcipher_walk_next(struct skcipher_walk *walk) walk->flags &= ~(SKCIPHER_WALK_SLOW | SKCIPHER_WALK_COPY | SKCIPHER_WALK_DIFF); - n = walk->total; + n = walk->total_msg; bsize = min(walk->stride, max(n, walk->blocksize)); n = scatterwalk_clamp(&walk->in, n); n = scatterwalk_clamp(&walk->out, n); if (unlikely(n < bsize)) { - if (unlikely(walk->total < walk->blocksize)) + if 
(unlikely(walk->total_msg < walk->blocksize)) return skcipher_walk_done(walk, -EINVAL); slow_path: @@ -388,41 +450,6 @@ static int skcipher_walk_next(struct skcipher_walk *walk) } EXPORT_SYMBOL_GPL(skcipher_walk_next); -static int skcipher_copy_iv(struct skcipher_walk *walk) -{ - unsigned a = crypto_tfm_ctx_alignment() - 1; - unsigned alignmask = walk->alignmask; - unsigned ivsize = walk->ivsize; - unsigned bs = walk->stride; -
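The message-boundary bookkeeping added above (write back the output IV, advance to the next message's slot in the packed IV buffer, raise nextmsg so IV-tweaking modes like XTS know to re-tweak) can be modeled in a few lines of plain C. This is a toy model with illustrative names, not the kernel implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IVSIZE 16

/* Minimal analogue of the walk state touched at a message boundary. */
struct toy_walk {
    uint8_t *oiv;  /* current slot in the packed output-IV buffer */
    int nextmsg;   /* raised exactly once per message boundary */
};

/* Analogue of the nextmsg branch in skcipher_walk_done(): called when
 * one message's bytes are exhausted and another message follows. */
static void toy_next_message(struct toy_walk *w, const uint8_t *final_iv)
{
    memcpy(w->oiv, final_iv, IVSIZE);  /* write the output IV */
    w->oiv += IVSIZE;                  /* move to the next message's IV */
    w->nextmsg = 1;                    /* consumer must re-tweak if needed */
}
```

Plain CBC-style ciphers can ignore `nextmsg` because the walk already points `iv` at the fresh slot; only ciphers that derive per-message state from the IV need to act on it.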
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
Am Donnerstag, 12. Januar 2017, 14:19:36 CET schrieb Herbert Xu: Hi Herbert, > > I think it's too much churn. Let's get the algif_aead code fixed > up first, and then pursue this later. To eliminate the extra churn of having all AEAD implementations changed to invoke the copy operation, what about adding the callback to crypto_aead_encrypt? This way, all AEAD implementations benefit from it without having an extra call added to each of them? Ciao Stephan
Re: [PATCH v4 3/3] drivers: crypto: Enable CPT options crypto for build
Hi George, [auto build test ERROR on v4.9-rc8] [cannot apply to cryptodev/master crypto/master sparc-next/master next-20170111] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/George-Cherian/Add-Support-for-Cavium-Cryptographic-Acceleration-Unit/20170112-192240 config: blackfin-allmodconfig (attached as .config) compiler: bfin-uclinux-gcc (GCC) 6.2.0 reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree make.cross ARCH=blackfin All error/warnings (new ones prefixed by >>): In file included from drivers/crypto/cavium/cpt/cptvf.h:13:0, from drivers/crypto/cavium/cpt/cptvf_main.c:12: drivers/crypto/cavium/cpt/cpt_common.h: In function 'cpt_write_csr64': >> drivers/crypto/cavium/cpt/cpt_common.h:151:2: error: implicit declaration of >> function 'writeq' [-Werror=implicit-function-declaration] writeq(val, hw_addr + offset); ^~ drivers/crypto/cavium/cpt/cpt_common.h: In function 'cpt_read_csr64': >> drivers/crypto/cavium/cpt/cpt_common.h:156:9: error: implicit declaration of >> function 'readq' [-Werror=implicit-function-declaration] return readq(hw_addr + offset); ^ In file included from drivers/crypto/cavium/cpt/cptvf_main.c:12:0: drivers/crypto/cavium/cpt/cptvf.h: At top level: >> drivers/crypto/cavium/cpt/cptvf.h:111:20: error: array type has incomplete >> element type 'struct msix_entry' struct msix_entry msix_entries[CPT_VF_MSIX_VECTORS]; ^~~~ drivers/crypto/cavium/cpt/cptvf_main.c: In function 'init_worker_threads': >> drivers/crypto/cavium/cpt/cptvf_main.c:52:9: warning: cast from pointer to >> integer of different size [-Wpointer-to-int-cast] (u64)cwqe_info); ^ drivers/crypto/cavium/cpt/cptvf_main.c: In function 'cptvf_disable_msix': >> drivers/crypto/cavium/cpt/cptvf_main.c:375:3: error: implicit declaration of >> function 
'pci_disable_msix' [-Werror=implicit-function-declaration] pci_disable_msix(cptvf->pdev); ^~~~ drivers/crypto/cavium/cpt/cptvf_main.c: In function 'cptvf_enable_msix': >> drivers/crypto/cavium/cpt/cptvf_main.c:387:8: error: implicit declaration of >> function 'pci_enable_msix' [-Werror=implicit-function-declaration] ret = pci_enable_msix(cptvf->pdev, cptvf->msix_entries, ^~~ drivers/crypto/cavium/cpt/cptvf_main.c: At top level: >> drivers/crypto/cavium/cpt/cptvf_main.c:942:1: warning: data definition has >> no type or storage class module_pci_driver(cptvf_pci_driver); ^ >> drivers/crypto/cavium/cpt/cptvf_main.c:942:1: error: type defaults to 'int' >> in declaration of 'module_pci_driver' [-Werror=implicit-int] >> drivers/crypto/cavium/cpt/cptvf_main.c:942:1: warning: parameter names >> (without types) in function declaration drivers/crypto/cavium/cpt/cptvf_main.c:934:26: warning: 'cptvf_pci_driver' defined but not used [-Wunused-variable] static struct pci_driver cptvf_pci_driver = { ^~~~ cc1: some warnings being treated as errors -- In file included from drivers/crypto/cavium/cpt/cptvf.h:13:0, from drivers/crypto/cavium/cpt/cptvf_reqmanager.c:9: drivers/crypto/cavium/cpt/cpt_common.h: In function 'cpt_write_csr64': >> drivers/crypto/cavium/cpt/cpt_common.h:151:2: error: implicit declaration of >> function 'writeq' [-Werror=implicit-function-declaration] writeq(val, hw_addr + offset); ^~ drivers/crypto/cavium/cpt/cpt_common.h: In function 'cpt_read_csr64': >> drivers/crypto/cavium/cpt/cpt_common.h:156:9: error: implicit declaration of >> function 'readq' [-Werror=implicit-function-declaration] return readq(hw_addr + offset); ^ In file included from drivers/crypto/cavium/cpt/cptvf_reqmanager.c:9:0: drivers/crypto/cavium/cpt/cptvf.h: At top level: >> drivers/crypto/cavium/cpt/cptvf.h:111:20: error: array type has incomplete >> element type 'struct msix_entry' struct msix_entry msix_entries[CPT_VF_MSIX_VECTORS]; ^~~~ cc1: some warnings being treated as errors -- In 
file included from drivers/crypto/cavium/cpt/cptpf.h:12:0, from drivers/crypto/cavium/cpt/cptpf_main.c:18: drivers/crypto/cavium/cpt/cpt_common.h: In function 'cpt_write_csr64': >> driv
Re: [PATCH 00/13] crypto: copy AAD during encrypt for AEAD ciphers
Am Donnerstag, 12. Januar 2017, 14:19:36 CET schrieb Herbert Xu: Hi Herbert, > On Tue, Jan 10, 2017 at 02:36:21AM +0100, Stephan Müller wrote: > > to all driver maintainers: the patches I added are compile tested, but > > I do not have the hardware to verify the code. May I ask the respective > > hardware maintainers to verify that the code is appropriate and works > > as intended? Thanks a lot. > > > > Herbert, this is my proposal for our discussion around copying the > > AAD for algif_aead. Instead of adding the code to algif_aead and wait > > until it transpires to all cipher implementations, I thought it would > > be more helpful to fix all cipher implementations. > > I think it's too much churn. Let's get the algif_aead code fixed > up first, and then pursue this later. My idea with this patch set was to have only a minimal change to any AEAD implementation, i.e. one callback to address this issue. When addressing the issue in the algif_aead code, and expect that over time the AEAD implementations will gain the copy operation, eventually we will copy the AAD twice. Of course, this could be prevented, if the algif_aead code somehow uses the same SGL for the src and dst AAD. Some time back I published the patch "crypto: AF_ALG - disregard AAD buffer space for output". This patch tried changing the src and dst SGL to remove the AAD. Considering this patch trying to change the src and dst SGL structure, I expect that the patch to algif_aead to have a common src/dst SGL for the AAD to prevent the AAD copy from the AEAD implementations is similar in complexity. Weighing the complexity of such a temporary band-aid for algif_aead against the addition of one callback to each AEAD implementation (which would need to be added at some point anyway), I thought it is less complex to add the one callback to the AEAD implementations. > > BTW, why are you only doing the copy for encryption? I was looking at the only AEAD implementation that does the copy operation: authenc. 
There, the copy operation is only performed for encryption. I was thinking a bit about why decryption was not covered. I think the answer is the following: for encryption, the AAD is definitely needed in the dst buffer as the dst buffer with the AAD must be sent to the recipient for decryption. The decryption and the associated authentication only work with the AAD. However, after decrypting, all the caller wants is the decrypted plaintext only. There is no further use of the AAD after the decryption step. Hence, copying the AAD to the dst buffer in the decryption step would not serve the caller. Ciao Stephan
Re: [PATCH v4 2/3] drivers: crypto: Add the Virtual Function driver for CPT
Hi Stephan, Thank you for the clarification. Regards, -George On 01/12/2017 04:49 PM, Stephan Müller wrote: Am Donnerstag, 12. Januar 2017, 16:40:32 CET schrieb George Cherian: Hi George, Sure, please do not forget to invoke xts_verify_key. Should I be using xts_check_key or xts_verify_key? Both are identical except for the input parameter -- the one requires crypto_skcipher, the other crypto_tfm. Depending what pointer you have handy in your setkey function, you would use the most appropriate one. Ciao Stephan
Re: [PATCH v4 2/3] drivers: crypto: Add the Virtual Function driver for CPT
Am Donnerstag, 12. Januar 2017, 16:40:32 CET schrieb George Cherian: Hi George, > > > > Sure, please do not forget to invoke xts_verify_key. > > Should I be using xts_check_key or xts_verify_key? Both are identical except for the input parameter -- the one requires crypto_skcipher, the other crypto_tfm. Depending what pointer you have handy in your setkey function, you would use the most appropriate one. Ciao Stephan
Re: [PATCH v4 2/3] drivers: crypto: Add the Virtual Function driver for CPT
Hi Stephan,

On 01/11/2017 06:09 PM, Stephan Müller wrote:
> On Wednesday, 11 January 2017, 16:58:17 CET, George Cherian wrote:
>
> Hi George,
>
>> I will add a separate function for xts setkey and make changes as
>> following.
>
> ...
>
>>>> +struct crypto_alg algs[] = { {
>>>> +	.cra_flags = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC,
>>>> +	.cra_blocksize = AES_BLOCK_SIZE,
>>>> +	.cra_ctxsize = sizeof(struct cvm_enc_ctx),
>>>> +	.cra_alignmask = 7,
>>>> +	.cra_priority = 4001,
>>>> +	.cra_name = "xts(aes)",
>>>> +	.cra_driver_name = "cavium-xts-aes",
>>>> +	.cra_type = &crypto_ablkcipher_type,
>>>> +	.cra_u = {
>>>> +		.ablkcipher = {
>>>> +			.ivsize = AES_BLOCK_SIZE,
>>>> +			.min_keysize = AES_MIN_KEY_SIZE,
>>>> +			.max_keysize = AES_MAX_KEY_SIZE,
>>>> +			.setkey = cvm_enc_dec_setkey,
>>>
>>> May I ask how the setkey for XTS is intended to work? The XTS keys are
>>> double the size of "normal" keys.
>>
>> 	.ablkcipher = {
>> 		.ivsize = AES_BLOCK_SIZE,
>> 		.min_keysize = 2 * AES_MIN_KEY_SIZE,
>> 		.max_keysize = 2 * AES_MAX_KEY_SIZE,
>> 		.setkey = cvm_xts_setkey,
>>
>> Hope this is fine?
>
> Sure, please do not forget to invoke xts_verify_key.

Should I be using xts_check_key or xts_verify_key?

> Ciao
> Stephan

Regards,
-George
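For reference, the check that xts_check_key()/xts_verify_key() perform can be mimicked in a standalone user-space sketch. This is modeled on the helpers in include/crypto/xts.h but simplified: `toy_xts_check_key` and the `fips_enabled` parameter are illustrative stand-ins, and the real kernel helpers additionally set tfm result flags and return -EINVAL. An XTS key is two concatenated keys of equal size, so the length must be even, and in FIPS mode the two halves must not be identical.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of the XTS key sanity checks (toy version, not the kernel API):
 * - keylen must split into two equal halves (cipher key + tweak key);
 * - in FIPS mode the two halves must differ (identical halves would
 *   degrade XTS, so such keys are rejected as weak). */
static int toy_xts_check_key(const unsigned char *key, size_t keylen,
                             int fips_enabled)
{
	if (keylen % 2)
		return -1;	/* stand-in for -EINVAL */

	if (fips_enabled && !memcmp(key, key + keylen / 2, keylen / 2))
		return -1;	/* cipher key and tweak key are identical */

	return 0;
}
```

In a setkey like the `cvm_xts_setkey` sketched above, this check would run first, and only then would the key be split into its two halves for programming into the hardware.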
Re: x86-64: Maintain 16-byte stack alignment
* Herbert Xu wrote:

>> But if we can't do this with automatic verification, then I'm not sure
>> I want to do it at all. The asm is already more precarious than I'd
>> like, and having a code path that is misaligned is asking for obscure
>> bugs down the road.
>
> I understand the need for automated checks at this point in time.
> But longer term this is just part of the calling ABI. After all,
> we don't add checks everywhere to ensure people preserve rbx.

The intelligent and responsible way to introduce such post facto ABI changes is via a smarter assembler: one that would detect the obvious cases where assembly code generates a misaligned stack, at build time. Assembly code can obviously still mess up in a hard-to-detect fashion if it tries -- but that's OK, as most assembly code doesn't try to go outside regular stack allocation patterns.

Such a static check is relatively straightforward to do in assembly tooling -- and perhaps objtool could do this too, as it already tracks the instructions that change the stack offset.

( And yes, this is what the GCC guys should have done, instead of sloppily introducing such silent breakages and making the whole application landscape less robust ... )

Thanks,

	Ingo
Re: x86-64: Maintain 16-byte stack alignment
On Thu, Jan 12, 2017 at 08:01:51AM +, Ard Biesheuvel wrote:
>
> [From memory] the arm64 ELF psABI mandates a 16 byte stack alignment
> at function entry, and 8 byte alignment at all other times. This means
> compiled code will typically preserve 16 byte alignment, and
> __aligned(16) on a stack variable will likely not result in an
> explicit alignment of the stack pointer *. But the arm64 ISA does not
> have any load/store instructions that would trigger a misalignment
> fault on an address that is 8 byte aligned but not 16 byte aligned, so
> the situation is very different from x86 (assuming I am correct in
> asserting that there are no failure modes resulting from a misaligned
> stack other than this one and a potential performance hit)

OK, sounds like we're already using 16-byte aligned stacks on ARM64. So unless x86-64 stacks are much smaller, I don't see the need to use 8-byte aligned stacks, at least from a stack space point of view.

Thanks,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: x86-64: Maintain 16-byte stack alignment
On Wed, Jan 11, 2017 at 11:51:10PM -0800, Andy Lutomirski wrote:
>
> The problem is that we have nasties like TRACE_IRQS_OFF. Performance

I don't understand. What's the issue with TRACE_IRQS_OFF? It should be treated as any other function call. That is, enter it with an aligned stack, and the TRACE_IRQS_OFF code itself should ensure the stack stays aligned before it calls down into C.

> But if we can't do this with automatic verification, then I'm not sure
> I want to do it at all. The asm is already more precarious than I'd
> like, and having a code path that is misaligned is asking for obscure
> bugs down the road.

I understand the need for automated checks at this point in time. But longer term this is just part of the calling ABI. After all, we don't add checks everywhere to ensure people preserve rbx.

Cheers,
--
Email: Herbert Xu
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: x86-64: Maintain 16-byte stack alignment
On 12 January 2017 at 06:12, Herbert Xu wrote:
> On Tue, Jan 10, 2017 at 05:30:48PM +, Ard Biesheuvel wrote:
>>
>> Apologies for introducing this breakage. It seemed like an obvious and
>> simple cleanup, so I didn't even bother to mention it in the commit
>> log, but if the kernel does not guarantee 16 byte alignment, I guess
>> we should revert to the old method. If SSE instructions are the only
>> ones that require this alignment, then I suppose not having an
>> ABI-conforming stack pointer should not be an issue in general.
>
> BTW Ard, what is the stack alignment on ARM64?

[From memory] the arm64 ELF psABI mandates a 16 byte stack alignment at function entry, and 8 byte alignment at all other times. This means compiled code will typically preserve 16 byte alignment, and __aligned(16) on a stack variable will likely not result in an explicit alignment of the stack pointer *. But the arm64 ISA does not have any load/store instructions that would trigger a misalignment fault on an address that is 8 byte aligned but not 16 byte aligned, so the situation is very different from x86 (assuming I am correct in asserting that there are no failure modes resulting from a misaligned stack other than this one and a potential performance hit).

* I didn't check whether the exception handling realigns the stack pointer on nested exceptions (arm64 has separate IRQ stacks)