Re: [PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)
On Wed, Mar 08, 2017 at 11:14:20PM +0200, Krzysztof Kozlowski wrote: > Running TCRYPT with LRW compiled causes spinlock recursion: > > testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption > tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 > seconds (304112 bytes) > tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 > seconds (1008192 bytes) > tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 > seconds (3659008 bytes) > tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 > seconds (12191744 bytes) > tcrypt: test 4 (256 bit key, 8192 byte blocks): > BUG: spinlock recursion on CPU#1, irq/84-1083/89 > lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, > .owner_cpu: 1 > CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted > 4.11.0-rc1-1-g897ca6d0800d #559 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > [] (unwind_backtrace) from [] (show_stack+0x10/0x14) > [] (show_stack) from [] (dump_stack+0x78/0x8c) > [] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120) > [] (do_raw_spin_lock) from [] > (_raw_spin_lock_irqsave+0x20/0x28) > [] (_raw_spin_lock_irqsave) from [] > (s5p_aes_crypt+0x2c/0xb4) > [] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 > [lrw]) > [] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 > [lrw]) > [] (encrypt_done [lrw]) from [] > (s5p_aes_complete+0x60/0xcc) > [] (s5p_aes_complete) from [] > (s5p_aes_interrupt+0x134/0x1a0) > [] (s5p_aes_interrupt) from [] > (irq_thread_fn+0x1c/0x54) > [] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0) > [] (irq_thread) from [] (kthread+0x108/0x138) > [] (kthread) from [] (ret_from_fork+0x14/0x3c) > > Interrupt handling routine was calling req->base.complete() under > spinlock. In most cases this wasn't fatal but when combined with some > of the cipher modes (like LRW) this caused recursion - starting the new > encryption (s5p_aes_crypt()) while still holding the spinlock from > previous round (s5p_aes_complete()). > > Beside that, the s5p_aes_interrupt() error handling path could execute > two completions in case of error for RX and TX blocks. > > Rewrite the interrupt handling routine and the completion by: > > 1. Splitting the operations on scatterlist copies from >s5p_aes_complete() into separate s5p_sg_done(). This still should be >done under lock. >The s5p_aes_complete() now only calls req->base.complete() and it has >to be called outside of lock. > > 2. Moving the s5p_aes_complete() out of spinlock critical sections. >In interrupt service routine s5p_aes_interrupts(), it appeared in few >places, including error paths inside other functions called from ISR. >This code was not so obvious to read so simplify it by putting the >s5p_aes_complete() only within ISR level. > > Reported-by: Nathan Royce> Cc: # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix > completing > Cc: # v4.10.x > Signed-off-by: Krzysztof Kozlowski Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: [PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)
On Wed, Mar 08, 2017 at 11:14:20PM +0200, Krzysztof Kozlowski wrote: > Running TCRYPT with LRW compiled causes spinlock recursion: > > testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption > tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 > seconds (304112 bytes) > tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 > seconds (1008192 bytes) > tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 > seconds (3659008 bytes) > tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 > seconds (12191744 bytes) > tcrypt: test 4 (256 bit key, 8192 byte blocks): > BUG: spinlock recursion on CPU#1, irq/84-1083/89 > lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, > .owner_cpu: 1 > CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted > 4.11.0-rc1-1-g897ca6d0800d #559 > Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) > [] (unwind_backtrace) from [] (show_stack+0x10/0x14) > [] (show_stack) from [] (dump_stack+0x78/0x8c) > [] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120) > [] (do_raw_spin_lock) from [] > (_raw_spin_lock_irqsave+0x20/0x28) > [] (_raw_spin_lock_irqsave) from [] > (s5p_aes_crypt+0x2c/0xb4) > [] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 > [lrw]) > [] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 > [lrw]) > [] (encrypt_done [lrw]) from [] > (s5p_aes_complete+0x60/0xcc) > [] (s5p_aes_complete) from [] > (s5p_aes_interrupt+0x134/0x1a0) > [] (s5p_aes_interrupt) from [] > (irq_thread_fn+0x1c/0x54) > [] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0) > [] (irq_thread) from [] (kthread+0x108/0x138) > [] (kthread) from [] (ret_from_fork+0x14/0x3c) > > Interrupt handling routine was calling req->base.complete() under > spinlock. In most cases this wasn't fatal but when combined with some > of the cipher modes (like LRW) this caused recursion - starting the new > encryption (s5p_aes_crypt()) while still holding the spinlock from > previous round (s5p_aes_complete()). > > Beside that, the s5p_aes_interrupt() error handling path could execute > two completions in case of error for RX and TX blocks. > > Rewrite the interrupt handling routine and the completion by: > > 1. Splitting the operations on scatterlist copies from >s5p_aes_complete() into separate s5p_sg_done(). This still should be >done under lock. >The s5p_aes_complete() now only calls req->base.complete() and it has >to be called outside of lock. > > 2. Moving the s5p_aes_complete() out of spinlock critical sections. >In interrupt service routine s5p_aes_interrupts(), it appeared in few >places, including error paths inside other functions called from ISR. >This code was not so obvious to read so simplify it by putting the >s5p_aes_complete() only within ISR level. > > Reported-by: Nathan Royce > Cc: # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix > completing > Cc: # v4.10.x > Signed-off-by: Krzysztof Kozlowski Patch applied. Thanks. -- Email: Herbert Xu Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)
Running TCRYPT with LRW compiled causes spinlock recursion: testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 seconds (304112 bytes) tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 seconds (1008192 bytes) tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 seconds (3659008 bytes) tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 seconds (12191744 bytes) tcrypt: test 4 (256 bit key, 8192 byte blocks): BUG: spinlock recursion on CPU#1, irq/84-1083/89 lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, .owner_cpu: 1 CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted 4.11.0-rc1-1-g897ca6d0800d #559 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0x78/0x8c) [] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120) [] (do_raw_spin_lock) from [] (_raw_spin_lock_irqsave+0x20/0x28) [] (_raw_spin_lock_irqsave) from [] (s5p_aes_crypt+0x2c/0xb4) [] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 [lrw]) [] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 [lrw]) [] (encrypt_done [lrw]) from [] (s5p_aes_complete+0x60/0xcc) [] (s5p_aes_complete) from [] (s5p_aes_interrupt+0x134/0x1a0) [] (s5p_aes_interrupt) from [] (irq_thread_fn+0x1c/0x54) [] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0) [] (irq_thread) from [] (kthread+0x108/0x138) [] (kthread) from [] (ret_from_fork+0x14/0x3c) Interrupt handling routine was calling req->base.complete() under spinlock. In most cases this wasn't fatal but when combined with some of the cipher modes (like LRW) this caused recursion - starting the new encryption (s5p_aes_crypt()) while still holding the spinlock from previous round (s5p_aes_complete()). Beside that, the s5p_aes_interrupt() error handling path could execute two completions in case of error for RX and TX blocks. Rewrite the interrupt handling routine and the completion by: 1. Splitting the operations on scatterlist copies from s5p_aes_complete() into separate s5p_sg_done(). This still should be done under lock. The s5p_aes_complete() now only calls req->base.complete() and it has to be called outside of lock. 2. Moving the s5p_aes_complete() out of spinlock critical sections. In interrupt service routine s5p_aes_interrupts(), it appeared in few places, including error paths inside other functions called from ISR. This code was not so obvious to read so simplify it by putting the s5p_aes_complete() only within ISR level. Reported-by: Nathan RoyceCc: # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix completing Cc: # v4.10.x Signed-off-by: Krzysztof Kozlowski --- I think, along with 07de4bc88c this should be backported to v4.10 as it happens there. Signed-off-by: Krzysztof Kozlowski --- drivers/crypto/s5p-sss.c | 127 ++- 1 file changed, 82 insertions(+), 45 deletions(-) diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c index a668286d62cb..1b9da3dc799b 100644 --- a/drivers/crypto/s5p-sss.c +++ b/drivers/crypto/s5p-sss.c @@ -270,7 +270,7 @@ static void s5p_sg_copy_buf(void *buf, struct scatterlist *sg, scatterwalk_done(, out, 0); } -static void s5p_aes_complete(struct s5p_aes_dev *dev, int err) +static void s5p_sg_done(struct s5p_aes_dev *dev) { if (dev->sg_dst_cpy) { dev_dbg(dev->dev, @@ -281,8 +281,11 @@ static void s5p_aes_complete(struct s5p_aes_dev *dev, int err) } s5p_free_sg_cpy(dev, >sg_src_cpy); s5p_free_sg_cpy(dev, >sg_dst_cpy); +} - /* holding a lock outside */ +/* Calls the completion. Cannot be called with dev->lock hold. */ +static void s5p_aes_complete(struct s5p_aes_dev *dev, int err) +{ dev->req->base.complete(>req->base, err); dev->busy = false; } @@ -368,51 +371,44 @@ static int s5p_set_indata(struct s5p_aes_dev *dev, struct scatterlist *sg) } /* - * Returns true if new transmitting (output) data is ready and its - * address+length have to be written to device (by calling - * s5p_set_dma_outdata()). False otherwise. + * Returns -ERRNO on error (mapping of new data failed). + * On success returns: + * - 0 if there is no more data, + * - 1 if new transmitting (output) data is ready and its address+length + * have to be written to device (by calling s5p_set_dma_outdata()). */ -static bool s5p_aes_tx(struct s5p_aes_dev *dev) +static int s5p_aes_tx(struct s5p_aes_dev *dev) { - int err = 0; - bool ret = false; + int ret = 0; s5p_unset_outdata(dev); if (!sg_is_last(dev->sg_dst)) { - err = s5p_set_outdata(dev,
[PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)
Running TCRYPT with LRW compiled causes spinlock recursion: testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 seconds (304112 bytes) tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 seconds (1008192 bytes) tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 seconds (3659008 bytes) tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 seconds (12191744 bytes) tcrypt: test 4 (256 bit key, 8192 byte blocks): BUG: spinlock recursion on CPU#1, irq/84-1083/89 lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, .owner_cpu: 1 CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted 4.11.0-rc1-1-g897ca6d0800d #559 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0x78/0x8c) [] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120) [] (do_raw_spin_lock) from [] (_raw_spin_lock_irqsave+0x20/0x28) [] (_raw_spin_lock_irqsave) from [] (s5p_aes_crypt+0x2c/0xb4) [] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 [lrw]) [] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 [lrw]) [] (encrypt_done [lrw]) from [] (s5p_aes_complete+0x60/0xcc) [] (s5p_aes_complete) from [] (s5p_aes_interrupt+0x134/0x1a0) [] (s5p_aes_interrupt) from [] (irq_thread_fn+0x1c/0x54) [] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0) [] (irq_thread) from [] (kthread+0x108/0x138) [] (kthread) from [] (ret_from_fork+0x14/0x3c) Interrupt handling routine was calling req->base.complete() under spinlock. In most cases this wasn't fatal but when combined with some of the cipher modes (like LRW) this caused recursion - starting the new encryption (s5p_aes_crypt()) while still holding the spinlock from previous round (s5p_aes_complete()). Beside that, the s5p_aes_interrupt() error handling path could execute two completions in case of error for RX and TX blocks. Rewrite the interrupt handling routine and the completion by: 1. Splitting the operations on scatterlist copies from s5p_aes_complete() into separate s5p_sg_done(). This still should be done under lock. The s5p_aes_complete() now only calls req->base.complete() and it has to be called outside of lock. 2. Moving the s5p_aes_complete() out of spinlock critical sections. In interrupt service routine s5p_aes_interrupts(), it appeared in few places, including error paths inside other functions called from ISR. This code was not so obvious to read so simplify it by putting the s5p_aes_complete() only within ISR level. Reported-by: Nathan Royce Cc: # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix completing Cc: # v4.10.x Signed-off-by: Krzysztof Kozlowski --- I think, along with 07de4bc88c this should be backported to v4.10 as it happens there. Signed-off-by: Krzysztof Kozlowski --- drivers/crypto/s5p-sss.c | 127 ++- 1 file changed, 82 insertions(+), 45 deletions(-) diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c index a668286d62cb..1b9da3dc799b 100644 --- a/drivers/crypto/s5p-sss.c +++ b/drivers/crypto/s5p-sss.c @@ -270,7 +270,7 @@ static void s5p_sg_copy_buf(void *buf, struct scatterlist *sg, scatterwalk_done(, out, 0); } -static void s5p_aes_complete(struct s5p_aes_dev *dev, int err) +static void s5p_sg_done(struct s5p_aes_dev *dev) { if (dev->sg_dst_cpy) { dev_dbg(dev->dev, @@ -281,8 +281,11 @@ static void s5p_aes_complete(struct s5p_aes_dev *dev, int err) } s5p_free_sg_cpy(dev, >sg_src_cpy); s5p_free_sg_cpy(dev, >sg_dst_cpy); +} - /* holding a lock outside */ +/* Calls the completion. Cannot be called with dev->lock hold. */ +static void s5p_aes_complete(struct s5p_aes_dev *dev, int err) +{ dev->req->base.complete(>req->base, err); dev->busy = false; } @@ -368,51 +371,44 @@ static int s5p_set_indata(struct s5p_aes_dev *dev, struct scatterlist *sg) } /* - * Returns true if new transmitting (output) data is ready and its - * address+length have to be written to device (by calling - * s5p_set_dma_outdata()). False otherwise. + * Returns -ERRNO on error (mapping of new data failed). + * On success returns: + * - 0 if there is no more data, + * - 1 if new transmitting (output) data is ready and its address+length + * have to be written to device (by calling s5p_set_dma_outdata()). */ -static bool s5p_aes_tx(struct s5p_aes_dev *dev) +static int s5p_aes_tx(struct s5p_aes_dev *dev) { - int err = 0; - bool ret = false; + int ret = 0; s5p_unset_outdata(dev); if (!sg_is_last(dev->sg_dst)) { - err = s5p_set_outdata(dev, sg_next(dev->sg_dst)); - if (err) - s5p_aes_complete(dev, err); -