Re: [PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)

2017-03-09 Thread Herbert Xu
On Wed, Mar 08, 2017 at 11:14:20PM +0200, Krzysztof Kozlowski wrote:
> Running TCRYPT with LRW compiled causes spinlock recursion:
> 
> testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption
> tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 
> seconds (304112 bytes)
> tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 
> seconds (1008192 bytes)
> tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 
> seconds (3659008 bytes)
> tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 
> seconds (12191744 bytes)
> tcrypt: test 4 (256 bit key, 8192 byte blocks):
> BUG: spinlock recursion on CPU#1, irq/84-1083/89
>  lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, 
> .owner_cpu: 1
> CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted 
> 4.11.0-rc1-1-g897ca6d0800d #559
> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> [] (show_stack) from [] (dump_stack+0x78/0x8c)
> [] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120)
> [] (do_raw_spin_lock) from [] 
> (_raw_spin_lock_irqsave+0x20/0x28)
> [] (_raw_spin_lock_irqsave) from [] 
> (s5p_aes_crypt+0x2c/0xb4)
> [] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 
> [lrw])
> [] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 
> [lrw])
> [] (encrypt_done [lrw]) from [] 
> (s5p_aes_complete+0x60/0xcc)
> [] (s5p_aes_complete) from [] 
> (s5p_aes_interrupt+0x134/0x1a0)
> [] (s5p_aes_interrupt) from [] 
> (irq_thread_fn+0x1c/0x54)
> [] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0)
> [] (irq_thread) from [] (kthread+0x108/0x138)
> [] (kthread) from [] (ret_from_fork+0x14/0x3c)
> 
> Interrupt handling routine was calling req->base.complete() under
> spinlock.  In most cases this wasn't fatal but when combined with some
> of the cipher modes (like LRW) this caused recursion - starting the new
> encryption (s5p_aes_crypt()) while still holding the spinlock from
> previous round (s5p_aes_complete()).
> 
> Beside that, the s5p_aes_interrupt() error handling path could execute
> two completions in case of error for RX and TX blocks.
> 
> Rewrite the interrupt handling routine and the completion by:
> 
> 1. Splitting the operations on scatterlist copies from
>s5p_aes_complete() into separate s5p_sg_done(). This still should be
>done under lock.
>The s5p_aes_complete() now only calls req->base.complete() and it has
>to be called outside of lock.
> 
> 2. Moving the s5p_aes_complete() out of spinlock critical sections.
>In interrupt service routine s5p_aes_interrupts(), it appeared in few
>places, including error paths inside other functions called from ISR.
>This code was not so obvious to read so simplify it by putting the
>s5p_aes_complete() only within ISR level.
> 
> Reported-by: Nathan Royce 
> Cc:  # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix 
> completing
> Cc:  # v4.10.x
> Signed-off-by: Krzysztof Kozlowski 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)

2017-03-09 Thread Herbert Xu
On Wed, Mar 08, 2017 at 11:14:20PM +0200, Krzysztof Kozlowski wrote:
> Running TCRYPT with LRW compiled causes spinlock recursion:
> 
> testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption
> tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 
> seconds (304112 bytes)
> tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 
> seconds (1008192 bytes)
> tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 
> seconds (3659008 bytes)
> tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 
> seconds (12191744 bytes)
> tcrypt: test 4 (256 bit key, 8192 byte blocks):
> BUG: spinlock recursion on CPU#1, irq/84-1083/89
>  lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, 
> .owner_cpu: 1
> CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted 
> 4.11.0-rc1-1-g897ca6d0800d #559
> Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
> [] (show_stack) from [] (dump_stack+0x78/0x8c)
> [] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120)
> [] (do_raw_spin_lock) from [] 
> (_raw_spin_lock_irqsave+0x20/0x28)
> [] (_raw_spin_lock_irqsave) from [] 
> (s5p_aes_crypt+0x2c/0xb4)
> [] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 
> [lrw])
> [] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 
> [lrw])
> [] (encrypt_done [lrw]) from [] 
> (s5p_aes_complete+0x60/0xcc)
> [] (s5p_aes_complete) from [] 
> (s5p_aes_interrupt+0x134/0x1a0)
> [] (s5p_aes_interrupt) from [] 
> (irq_thread_fn+0x1c/0x54)
> [] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0)
> [] (irq_thread) from [] (kthread+0x108/0x138)
> [] (kthread) from [] (ret_from_fork+0x14/0x3c)
> 
> Interrupt handling routine was calling req->base.complete() under
> spinlock.  In most cases this wasn't fatal but when combined with some
> of the cipher modes (like LRW) this caused recursion - starting the new
> encryption (s5p_aes_crypt()) while still holding the spinlock from
> previous round (s5p_aes_complete()).
> 
> Beside that, the s5p_aes_interrupt() error handling path could execute
> two completions in case of error for RX and TX blocks.
> 
> Rewrite the interrupt handling routine and the completion by:
> 
> 1. Splitting the operations on scatterlist copies from
>s5p_aes_complete() into separate s5p_sg_done(). This still should be
>done under lock.
>The s5p_aes_complete() now only calls req->base.complete() and it has
>to be called outside of lock.
> 
> 2. Moving the s5p_aes_complete() out of spinlock critical sections.
>In interrupt service routine s5p_aes_interrupts(), it appeared in few
>places, including error paths inside other functions called from ISR.
>This code was not so obvious to read so simplify it by putting the
>s5p_aes_complete() only within ISR level.
> 
> Reported-by: Nathan Royce 
> Cc:  # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix 
> completing
> Cc:  # v4.10.x
> Signed-off-by: Krzysztof Kozlowski 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)

2017-03-08 Thread Krzysztof Kozlowski
Running TCRYPT with LRW compiled causes spinlock recursion:

testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption
tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 seconds 
(304112 bytes)
tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 seconds 
(1008192 bytes)
tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 
seconds (3659008 bytes)
tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 
seconds (12191744 bytes)
tcrypt: test 4 (256 bit key, 8192 byte blocks):
BUG: spinlock recursion on CPU#1, irq/84-1083/89
 lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, 
.owner_cpu: 1
CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted 
4.11.0-rc1-1-g897ca6d0800d #559
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0x78/0x8c)
[] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120)
[] (do_raw_spin_lock) from [] 
(_raw_spin_lock_irqsave+0x20/0x28)
[] (_raw_spin_lock_irqsave) from [] 
(s5p_aes_crypt+0x2c/0xb4)
[] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 [lrw])
[] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 
[lrw])
[] (encrypt_done [lrw]) from [] 
(s5p_aes_complete+0x60/0xcc)
[] (s5p_aes_complete) from [] 
(s5p_aes_interrupt+0x134/0x1a0)
[] (s5p_aes_interrupt) from [] (irq_thread_fn+0x1c/0x54)
[] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0)
[] (irq_thread) from [] (kthread+0x108/0x138)
[] (kthread) from [] (ret_from_fork+0x14/0x3c)

Interrupt handling routine was calling req->base.complete() under
spinlock.  In most cases this wasn't fatal but when combined with some
of the cipher modes (like LRW) this caused recursion - starting the new
encryption (s5p_aes_crypt()) while still holding the spinlock from
previous round (s5p_aes_complete()).

Beside that, the s5p_aes_interrupt() error handling path could execute
two completions in case of error for RX and TX blocks.

Rewrite the interrupt handling routine and the completion by:

1. Splitting the operations on scatterlist copies from
   s5p_aes_complete() into separate s5p_sg_done(). This still should be
   done under lock.
   The s5p_aes_complete() now only calls req->base.complete() and it has
   to be called outside of lock.

2. Moving the s5p_aes_complete() out of spinlock critical sections.
   In interrupt service routine s5p_aes_interrupts(), it appeared in few
   places, including error paths inside other functions called from ISR.
   This code was not so obvious to read so simplify it by putting the
   s5p_aes_complete() only within ISR level.

Reported-by: Nathan Royce 
Cc:  # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix 
completing
Cc:  # v4.10.x
Signed-off-by: Krzysztof Kozlowski 

---

I think, along with 07de4bc88c this should be backported to v4.10 as it
happens there.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/crypto/s5p-sss.c | 127 ++-
 1 file changed, 82 insertions(+), 45 deletions(-)

diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
index a668286d62cb..1b9da3dc799b 100644
--- a/drivers/crypto/s5p-sss.c
+++ b/drivers/crypto/s5p-sss.c
@@ -270,7 +270,7 @@ static void s5p_sg_copy_buf(void *buf, struct scatterlist 
*sg,
scatterwalk_done(, out, 0);
 }
 
-static void s5p_aes_complete(struct s5p_aes_dev *dev, int err)
+static void s5p_sg_done(struct s5p_aes_dev *dev)
 {
if (dev->sg_dst_cpy) {
dev_dbg(dev->dev,
@@ -281,8 +281,11 @@ static void s5p_aes_complete(struct s5p_aes_dev *dev, int 
err)
}
s5p_free_sg_cpy(dev, >sg_src_cpy);
s5p_free_sg_cpy(dev, >sg_dst_cpy);
+}
 
-   /* holding a lock outside */
+/* Calls the completion. Cannot be called with dev->lock hold. */
+static void s5p_aes_complete(struct s5p_aes_dev *dev, int err)
+{
dev->req->base.complete(>req->base, err);
dev->busy = false;
 }
@@ -368,51 +371,44 @@ static int s5p_set_indata(struct s5p_aes_dev *dev, struct 
scatterlist *sg)
 }
 
 /*
- * Returns true if new transmitting (output) data is ready and its
- * address+length have to be written to device (by calling
- * s5p_set_dma_outdata()). False otherwise.
+ * Returns -ERRNO on error (mapping of new data failed).
+ * On success returns:
+ *  - 0 if there is no more data,
+ *  - 1 if new transmitting (output) data is ready and its address+length
+ * have to be written to device (by calling s5p_set_dma_outdata()).
  */
-static bool s5p_aes_tx(struct s5p_aes_dev *dev)
+static int s5p_aes_tx(struct s5p_aes_dev *dev)
 {
-   int err = 0;
-   bool ret = false;
+   int ret = 0;
 
s5p_unset_outdata(dev);
 
if (!sg_is_last(dev->sg_dst)) {
-   err = s5p_set_outdata(dev, 

[PATCH] crypto: s5p-sss - Fix spinlock recursion on LRW(AES)

2017-03-08 Thread Krzysztof Kozlowski
Running TCRYPT with LRW compiled causes spinlock recursion:

testing speed of async lrw(aes) (lrw(ecb-aes-s5p)) encryption
tcrypt: test 0 (256 bit key, 16 byte blocks): 19007 operations in 1 seconds 
(304112 bytes)
tcrypt: test 1 (256 bit key, 64 byte blocks): 15753 operations in 1 seconds 
(1008192 bytes)
tcrypt: test 2 (256 bit key, 256 byte blocks): 14293 operations in 1 
seconds (3659008 bytes)
tcrypt: test 3 (256 bit key, 1024 byte blocks): 11906 operations in 1 
seconds (12191744 bytes)
tcrypt: test 4 (256 bit key, 8192 byte blocks):
BUG: spinlock recursion on CPU#1, irq/84-1083/89
 lock: 0xeea99a68, .magic: dead4ead, .owner: irq/84-1083/89, 
.owner_cpu: 1
CPU: 1 PID: 89 Comm: irq/84-1083 Not tainted 
4.11.0-rc1-1-g897ca6d0800d #559
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[] (show_stack) from [] (dump_stack+0x78/0x8c)
[] (dump_stack) from [] (do_raw_spin_lock+0x11c/0x120)
[] (do_raw_spin_lock) from [] 
(_raw_spin_lock_irqsave+0x20/0x28)
[] (_raw_spin_lock_irqsave) from [] 
(s5p_aes_crypt+0x2c/0xb4)
[] (s5p_aes_crypt) from [] (do_encrypt+0x78/0xb0 [lrw])
[] (do_encrypt [lrw]) from [] (encrypt_done+0x24/0x54 
[lrw])
[] (encrypt_done [lrw]) from [] 
(s5p_aes_complete+0x60/0xcc)
[] (s5p_aes_complete) from [] 
(s5p_aes_interrupt+0x134/0x1a0)
[] (s5p_aes_interrupt) from [] (irq_thread_fn+0x1c/0x54)
[] (irq_thread_fn) from [] (irq_thread+0x12c/0x1e0)
[] (irq_thread) from [] (kthread+0x108/0x138)
[] (kthread) from [] (ret_from_fork+0x14/0x3c)

Interrupt handling routine was calling req->base.complete() under
spinlock.  In most cases this wasn't fatal but when combined with some
of the cipher modes (like LRW) this caused recursion - starting the new
encryption (s5p_aes_crypt()) while still holding the spinlock from
previous round (s5p_aes_complete()).

Beside that, the s5p_aes_interrupt() error handling path could execute
two completions in case of error for RX and TX blocks.

Rewrite the interrupt handling routine and the completion by:

1. Splitting the operations on scatterlist copies from
   s5p_aes_complete() into separate s5p_sg_done(). This still should be
   done under lock.
   The s5p_aes_complete() now only calls req->base.complete() and it has
   to be called outside of lock.

2. Moving the s5p_aes_complete() out of spinlock critical sections.
   In interrupt service routine s5p_aes_interrupts(), it appeared in few
   places, including error paths inside other functions called from ISR.
   This code was not so obvious to read so simplify it by putting the
   s5p_aes_complete() only within ISR level.

Reported-by: Nathan Royce 
Cc:  # v4.10.x: 07de4bc88c crypto: s5p-sss - Fix 
completing
Cc:  # v4.10.x
Signed-off-by: Krzysztof Kozlowski 

---

I think, along with 07de4bc88c this should be backported to v4.10 as it
happens there.

Signed-off-by: Krzysztof Kozlowski 
---
 drivers/crypto/s5p-sss.c | 127 ++-
 1 file changed, 82 insertions(+), 45 deletions(-)

diff --git a/drivers/crypto/s5p-sss.c b/drivers/crypto/s5p-sss.c
index a668286d62cb..1b9da3dc799b 100644
--- a/drivers/crypto/s5p-sss.c
+++ b/drivers/crypto/s5p-sss.c
@@ -270,7 +270,7 @@ static void s5p_sg_copy_buf(void *buf, struct scatterlist 
*sg,
scatterwalk_done(, out, 0);
 }
 
-static void s5p_aes_complete(struct s5p_aes_dev *dev, int err)
+static void s5p_sg_done(struct s5p_aes_dev *dev)
 {
if (dev->sg_dst_cpy) {
dev_dbg(dev->dev,
@@ -281,8 +281,11 @@ static void s5p_aes_complete(struct s5p_aes_dev *dev, int 
err)
}
s5p_free_sg_cpy(dev, >sg_src_cpy);
s5p_free_sg_cpy(dev, >sg_dst_cpy);
+}
 
-   /* holding a lock outside */
+/* Calls the completion. Cannot be called with dev->lock hold. */
+static void s5p_aes_complete(struct s5p_aes_dev *dev, int err)
+{
dev->req->base.complete(>req->base, err);
dev->busy = false;
 }
@@ -368,51 +371,44 @@ static int s5p_set_indata(struct s5p_aes_dev *dev, struct 
scatterlist *sg)
 }
 
 /*
- * Returns true if new transmitting (output) data is ready and its
- * address+length have to be written to device (by calling
- * s5p_set_dma_outdata()). False otherwise.
+ * Returns -ERRNO on error (mapping of new data failed).
+ * On success returns:
+ *  - 0 if there is no more data,
+ *  - 1 if new transmitting (output) data is ready and its address+length
+ * have to be written to device (by calling s5p_set_dma_outdata()).
  */
-static bool s5p_aes_tx(struct s5p_aes_dev *dev)
+static int s5p_aes_tx(struct s5p_aes_dev *dev)
 {
-   int err = 0;
-   bool ret = false;
+   int ret = 0;
 
s5p_unset_outdata(dev);
 
if (!sg_is_last(dev->sg_dst)) {
-   err = s5p_set_outdata(dev, sg_next(dev->sg_dst));
-   if (err)
-   s5p_aes_complete(dev, err);
-