Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

2016-04-27 Thread Jassi Brar
On 27 April 2016 at 19:17, Robin Murphy  wrote:

>> Instead of churning the code, I would suggest either check in a loop
>> that we have a desc OR allocate 2 or NR_DEFAULT_DESC descriptors
>> there. Probably we get more descriptors at the same cost of memory.
>
>
> Having had a quick look into how beneficial that might be, I discover that
> what's provoking the add_desc() race is something else causing
> desc_pool->head to get wedged pointing at itself, so list_empty() forever
> returns true and we kick off a storm of allocations while desc_pool->tail
> fills up with unused descriptors. Looks like I'm going to have to set aside
> some time to properly debug and fix this driver if I want to use it for
> stress-testing IOMMU code...
>
OK, thanks.
And yes, we still need to fix that potential race in pl330_get_desc,
probably by retrying in a loop.

>>>
>>> I'm also seeing what looks like another occasional race under the same
>>> conditions where pl330_tx_submit() blows up from dma_cookie_assign()
>>> dereferencing a bogus tx->chan, but I think that's beyond my ability to
>>> figure out right now. Similarly the storm of WARNs from
>>> pl330_issue_pending()
>>> when using a large number of small buffers and dmatest.noverify=1. This
>>> one was some obvious low-hanging fruit.
>>>
>> Sorry, that part of code has changed a lot since I wrote the driver,
>> so more details will help me.
>
>
> Here's the actual splat:
>
> [  220.649267] Unable to handle kernel paging request at virtual address
> 100341338
> [  220.666576] pgd = ff8008dcb000
> [  220.679805] [100341338] *pgd=, *pud=
> [  220.696116] Internal error: Oops: 9605 [#1] PREEMPT SMP
> [  220.711579] Modules linked in:
> [  220.724459] CPU: 3 PID: 1249 Comm: dma0chan5-copy0 Not tainted 4.6.0-rc4+
> #527
> [  220.741652] Hardware name: ARM Juno development board (r1) (DT)
> [  220.757512] task: ffc976255400 ti: ffc975aa4000 task.ti:
> ffc975aa4000
> [  220.774979] PC is at pl330_tx_submit+0x68/0x108
> [  220.789412] LR is at pl330_tx_submit+0x2c/0x108
> [  220.803704] pc : [] lr : [] pstate:
> 81c5
> [  220.821078] sp : ffc975aa7c90
> [  220.834236] x29: ffc975aa7c90 x28: ffc975e81800
> [  220.849303] x27: ffc97626c018 x26: ff8008cd9000
> [  220.864314] x25:  x24: 
> [  220.879300] x23: ff8008d8d410 x22: ff8008ce52f0
> [  220.894266] x21: ffc976220d18 x20: ffc079138610
> [  220.909139] x19: ffc976220c60 x18: 0010
> [  220.923857] x17: 0007 x16: 0001
> [  220.938418] x15: 0001 x14: 0001020304050607
> [  220.952886] x13: 08090a0b0c0d0e0f x12: 1011121314151617
> [  220.967305] x11: 18191a1b1c1d1e1f x10: 0001020304050607
> [  220.981603] x9 :  x8 : 0010
> [  220.995783] x7 : 0001 x6 : ffc976220ce0
> [  221.009899] x5 : 000100341330 x4 : ffc079138600
> [  221.023775] x3 :  x2 : ffc97626c158
> [  221.037417] x1 : ffc97636a790 x0 : 0140
> [  221.050845]
> [  221.060126] Process dma0chan5-copy0 (pid: 1249, stack limit =
> 0xffc975aa4020)
> [  221.075704] Stack: (0xffc975aa7c90 to 0xffc975aa8000)
> [  221.089574] 7c80:   ffc975aa7cd0
> ff80083d9e60
> [  221.105653] 7ca0: ffc9758b88c0 2b12 20b8
> ff8008ce52f0
> [  221.121765] 7cc0: f9f82b12 ffc0790e5b00 ffc975aa7e30
> ff80080d7138
> [  221.137833] 7ce0: ffc975e81780 ff8008d90b00 ff8008af0d40
> ffc975e81800
> [  221.153811] 7d00: ff80083d91c0  
> 
> [  221.169777] 7d20:   ff8008ca7000
> 00323ef9fd9c
> [  221.185714] 7d40: 0008 ffc975aa7e20 ff8008b265d8
> 
> [  221.201753] 7d60: 0010  000175aa7dc0
> 
> [  221.217898] 7d80: 0259 0001 ff8008b265f0
> ffc9758b88e8
> [  221.234155] 7da0: ff802b12 ffc975aa7cd0 ffc976220c88
> 005d99c3
> [  221.250577] 7dc0: 0001 02f40259 ff8008d12be0
> ffc975aa7cc0
> [  221.267166] 7de0:  ffc975aa7df0 ff800bcc0bcc
> ffc975aa7df8
> [  221.283840] 7e00: ffc975aa7df8 c5b6eabfb5a2c4b8 ffc9
> ff80080f2048
> [  221.300672] 7e20: ffc975aa7e20 ffc975aa7e20 
> ff8008085e10
> [  221.317405] 7e40: ff80080d7068 ffc975e81780 
> 
> [  221.334211] 7e60:  ff80080df900 ff80080d7068
> 
> [  221.351119] 7e80:  ffc975e81800 
> 
> [  221.368086] 7ea0: ffc975aa7ea0 ffc975aa7ea0 
> ff80
> [  221.385128] 7ec0: 

Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

2016-04-27 Thread Jassi Brar
On 27 April 2016 at 19:17, Robin Murphy  wrote:

>> Instead of churning the code, I would suggest either check in a loop
>> that we have a desc OR allocate 2 or NR_DEFAULT_DESC descriptors
>> there. Probably we get more descriptors at the same cost of memory.
>
>
> Having had a quick look into how beneficial that might be, I discover that
> what's provoking the add_desc() race is something else causing
> desc_pool->head to get wedged pointing at itself, so list_empty() forever
> returns true and we kick off a storm of allocations while desc_pool->tail
> fills up with unused descriptors. Looks like I'm going to have to set aside
> some time to properly debug and fix this driver if I want to use it for
> stress-testing IOMMU code...
>
OK, thanks.
And yes, we still need to fix that potential race in pl330_get_desc,
probably by retrying in a loop.

>>>
>>> I'm also seeing what looks like another occasional race under the same
>>> conditions where pl330_tx_submit() blows up from dma_cookie_assign()
>>> dereferencing a bogus tx->chan, but I think that's beyond my ability to
>>> figure out right now. Similarly the storm of WARNs from
>>> pl330_issue_pending()
>>> when using a large number of small buffers and dmatest.noverify=1. This
>>> one was some obvious low-hanging fruit.
>>>
>> Sorry, that part of code has changed a lot since I wrote the driver,
>> so more details will help me.
>
>
> Here's the actual splat:
>
> [  220.649267] Unable to handle kernel paging request at virtual address
> 100341338
> [  220.666576] pgd = ff8008dcb000
> [  220.679805] [100341338] *pgd=, *pud=
> [  220.696116] Internal error: Oops: 9605 [#1] PREEMPT SMP
> [  220.711579] Modules linked in:
> [  220.724459] CPU: 3 PID: 1249 Comm: dma0chan5-copy0 Not tainted 4.6.0-rc4+
> #527
> [  220.741652] Hardware name: ARM Juno development board (r1) (DT)
> [  220.757512] task: ffc976255400 ti: ffc975aa4000 task.ti:
> ffc975aa4000
> [  220.774979] PC is at pl330_tx_submit+0x68/0x108
> [  220.789412] LR is at pl330_tx_submit+0x2c/0x108
> [  220.803704] pc : [] lr : [] pstate:
> 81c5
> [  220.821078] sp : ffc975aa7c90
> [  220.834236] x29: ffc975aa7c90 x28: ffc975e81800
> [  220.849303] x27: ffc97626c018 x26: ff8008cd9000
> [  220.864314] x25:  x24: 
> [  220.879300] x23: ff8008d8d410 x22: ff8008ce52f0
> [  220.894266] x21: ffc976220d18 x20: ffc079138610
> [  220.909139] x19: ffc976220c60 x18: 0010
> [  220.923857] x17: 0007 x16: 0001
> [  220.938418] x15: 0001 x14: 0001020304050607
> [  220.952886] x13: 08090a0b0c0d0e0f x12: 1011121314151617
> [  220.967305] x11: 18191a1b1c1d1e1f x10: 0001020304050607
> [  220.981603] x9 :  x8 : 0010
> [  220.995783] x7 : 0001 x6 : ffc976220ce0
> [  221.009899] x5 : 000100341330 x4 : ffc079138600
> [  221.023775] x3 :  x2 : ffc97626c158
> [  221.037417] x1 : ffc97636a790 x0 : 0140
> [  221.050845]
> [  221.060126] Process dma0chan5-copy0 (pid: 1249, stack limit =
> 0xffc975aa4020)
> [  221.075704] Stack: (0xffc975aa7c90 to 0xffc975aa8000)
> [  221.089574] 7c80:   ffc975aa7cd0
> ff80083d9e60
> [  221.105653] 7ca0: ffc9758b88c0 2b12 20b8
> ff8008ce52f0
> [  221.121765] 7cc0: f9f82b12 ffc0790e5b00 ffc975aa7e30
> ff80080d7138
> [  221.137833] 7ce0: ffc975e81780 ff8008d90b00 ff8008af0d40
> ffc975e81800
> [  221.153811] 7d00: ff80083d91c0  
> 
> [  221.169777] 7d20:   ff8008ca7000
> 00323ef9fd9c
> [  221.185714] 7d40: 0008 ffc975aa7e20 ff8008b265d8
> 
> [  221.201753] 7d60: 0010  000175aa7dc0
> 
> [  221.217898] 7d80: 0259 0001 ff8008b265f0
> ffc9758b88e8
> [  221.234155] 7da0: ff802b12 ffc975aa7cd0 ffc976220c88
> 005d99c3
> [  221.250577] 7dc0: 0001 02f40259 ff8008d12be0
> ffc975aa7cc0
> [  221.267166] 7de0:  ffc975aa7df0 ff800bcc0bcc
> ffc975aa7df8
> [  221.283840] 7e00: ffc975aa7df8 c5b6eabfb5a2c4b8 ffc9
> ff80080f2048
> [  221.300672] 7e20: ffc975aa7e20 ffc975aa7e20 
> ff8008085e10
> [  221.317405] 7e40: ff80080d7068 ffc975e81780 
> 
> [  221.334211] 7e60:  ff80080df900 ff80080d7068
> 
> [  221.351119] 7e80:  ffc975e81800 
> 
> [  221.368086] 7ea0: ffc975aa7ea0 ffc975aa7ea0 
> ff80
> [  221.385128] 7ec0: ffc975aa7ec0 ffc975aa7ec0 

Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

2016-04-27 Thread Robin Murphy

Hi Jassi,

On 27/04/16 04:26, Jassi Brar wrote:

On Tue, Apr 26, 2016 at 10:00 PM, Robin Murphy  wrote:

The current logic in pl330_get_desc() contains a clear race condition,
whereby if the descriptor pool is empty, we will create a new
descriptor, add it to the pool with the lock held, *release the lock*,
then try to remove it from the pool again.

Furthermore, if that second attempt then finds the pool empty because
someone else got there first, we conclude that something must have gone
terribly wrong and it's the absolute end of the world.


We may retry or simply increase the number of descriptors allocated in
a batch from 1 to, say, NR_DEFAULT_DESC.

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 372b435..7179a4d 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2449,7 +2449,7 @@ static struct dma_pl330_desc
*pl330_get_desc(struct dma_pl330_chan *pch)

 /* If the DMAC pool is empty, alloc new */
 if (!desc) {
-   if (!add_desc(pl330, GFP_ATOMIC, 1))
+   if (!add_desc(pl330, GFP_ATOMIC, NR_DEFAULT_DESC))
 return NULL;

 /* Try again */


I'm not sure I see how leaving the obvious race in place and just making 
it n times harder to hit is in any way not a terrible idea compared to 
actually fixing it.



...
[  709.328891] dma-pl330 7ff0.dma: pl330_get_desc:2459 ALERT!
** 10 printk messages dropped ** [  709.352280] dma-pl330 7ff0.dma: 
__pl330_prep_dma_memcpy:2493 Unable to fetch desc
** 11 printk messages dropped ** [  709.372930] dma-pl330 7ff0.dma: 
pl330_get_desc:2459 ALERT!
** 2 printk messages dropped ** [  709.372953] dma-pl330 7ff0.dma: 
pl330_get_desc:2459 ALERT!
** 8 printk messages dropped ** [  709.374157] dma-pl330 7ff0.dma: 
__pl330_prep_dma_memcpy:2493 Unable to fetch desc
** 41 printk messages dropped ** [  709.393095] dmatest: dma0chan4-copy1: 
result #4545: 'prep error' with src_off=0x3a32 dst_off=0x46dd len=0xc5c3 (0)
...

Down with this sort of thing! The most sensible thing to do is avoid the
allocate-add-remove dance entirely and simply use the freshly-allocated
descriptor straight away.


... but you still try to first pluck from the list.


Well, that's the point of having a pool, right? Try to reuse a free 
descriptor if there is one, otherwise expand the pool. It's just that in 
the latter case, race or not, it's absolutely pointless to add the new 
descriptor to the pool only to immediately remove it again, when you can 
simply start using it instead - either way there's still an extra 
descriptor released back to the pool once we're done with it.



Instead of churning the code, I would suggest either check in a loop
that we have a desc OR allocate 2 or NR_DEFAULT_DESC descriptors
there. Probably we get more descriptors at the same cost of memory.


Having had a quick look into how beneficial that might be, I discover 
that what's provoking the add_desc() race is something else causing 
desc_pool->head to get wedged pointing at itself, so list_empty() 
forever returns true and we kick off a storm of allocations while 
desc_pool->tail fills up with unused descriptors. Looks like I'm going 
to have to set aside some time to properly debug and fix this driver if 
I want to use it for stress-testing IOMMU code...




I'm also seeing what looks like another occasional race under the same
conditions where pl330_tx_submit() blows up from dma_cookie_assign()
dereferencing a bogus tx->chan, but I think that's beyond my ability to
figure out right now. Similarly the storm of WARNs from pl330_issue_pending()
when using a large number of small buffers and dmatest.noverify=1. This
one was some obvious low-hanging fruit.


Sorry, that part of code has changed a lot since I wrote the driver,
so more details will help me.


Here's the actual splat:

[  220.649267] Unable to handle kernel paging request at virtual address 
100341338

[  220.666576] pgd = ff8008dcb000
[  220.679805] [100341338] *pgd=, *pud=
[  220.696116] Internal error: Oops: 9605 [#1] PREEMPT SMP
[  220.711579] Modules linked in:
[  220.724459] CPU: 3 PID: 1249 Comm: dma0chan5-copy0 Not tainted 
4.6.0-rc4+ #527

[  220.741652] Hardware name: ARM Juno development board (r1) (DT)
[  220.757512] task: ffc976255400 ti: ffc975aa4000 task.ti: 
ffc975aa4000

[  220.774979] PC is at pl330_tx_submit+0x68/0x108
[  220.789412] LR is at pl330_tx_submit+0x2c/0x108
[  220.803704] pc : [] lr : [] 
pstate: 81c5

[  220.821078] sp : ffc975aa7c90
[  220.834236] x29: ffc975aa7c90 x28: ffc975e81800
[  220.849303] x27: ffc97626c018 x26: ff8008cd9000
[  220.864314] x25:  x24: 
[  220.879300] x23: ff8008d8d410 x22: ff8008ce52f0
[  220.894266] x21: ffc976220d18 x20: ffc079138610
[  220.909139] x19: ffc976220c60 x18: 0010
[  220.923857] x17: 0007 

Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

2016-04-27 Thread Robin Murphy

Hi Jassi,

On 27/04/16 04:26, Jassi Brar wrote:

On Tue, Apr 26, 2016 at 10:00 PM, Robin Murphy  wrote:

The current logic in pl330_get_desc() contains a clear race condition,
whereby if the descriptor pool is empty, we will create a new
descriptor, add it to the pool with the lock held, *release the lock*,
then try to remove it from the pool again.

Furthermore, if that second attempt then finds the pool empty because
someone else got there first, we conclude that something must have gone
terribly wrong and it's the absolute end of the world.


We may retry or simply increase the number of descriptors allocated in
a batch from 1 to, say, NR_DEFAULT_DESC.

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 372b435..7179a4d 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2449,7 +2449,7 @@ static struct dma_pl330_desc
*pl330_get_desc(struct dma_pl330_chan *pch)

 /* If the DMAC pool is empty, alloc new */
 if (!desc) {
-   if (!add_desc(pl330, GFP_ATOMIC, 1))
+   if (!add_desc(pl330, GFP_ATOMIC, NR_DEFAULT_DESC))
 return NULL;

 /* Try again */


I'm not sure I see how leaving the obvious race in place and just making 
it n times harder to hit is in any way not a terrible idea compared to 
actually fixing it.



...
[  709.328891] dma-pl330 7ff0.dma: pl330_get_desc:2459 ALERT!
** 10 printk messages dropped ** [  709.352280] dma-pl330 7ff0.dma: 
__pl330_prep_dma_memcpy:2493 Unable to fetch desc
** 11 printk messages dropped ** [  709.372930] dma-pl330 7ff0.dma: 
pl330_get_desc:2459 ALERT!
** 2 printk messages dropped ** [  709.372953] dma-pl330 7ff0.dma: 
pl330_get_desc:2459 ALERT!
** 8 printk messages dropped ** [  709.374157] dma-pl330 7ff0.dma: 
__pl330_prep_dma_memcpy:2493 Unable to fetch desc
** 41 printk messages dropped ** [  709.393095] dmatest: dma0chan4-copy1: 
result #4545: 'prep error' with src_off=0x3a32 dst_off=0x46dd len=0xc5c3 (0)
...

Down with this sort of thing! The most sensible thing to do is avoid the
allocate-add-remove dance entirely and simply use the freshly-allocated
descriptor straight away.


... but you still try to first pluck from the list.


Well, that's the point of having a pool, right? Try to reuse a free 
descriptor if there is one, otherwise expand the pool. It's just that in 
the latter case, race or not, it's absolutely pointless to add the new 
descriptor to the pool only to immediately remove it again, when you can 
simply start using it instead - either way there's still an extra 
descriptor released back to the pool once we're done with it.



Instead of churning the code, I would suggest either check in a loop
that we have a desc OR allocate 2 or NR_DEFAULT_DESC descriptors
there. Probably we get more descriptors at the same cost of memory.


Having had a quick look into how beneficial that might be, I discover 
that what's provoking the add_desc() race is something else causing 
desc_pool->head to get wedged pointing at itself, so list_empty() 
forever returns true and we kick off a storm of allocations while 
desc_pool->tail fills up with unused descriptors. Looks like I'm going 
to have to set aside some time to properly debug and fix this driver if 
I want to use it for stress-testing IOMMU code...




I'm also seeing what looks like another occasional race under the same
conditions where pl330_tx_submit() blows up from dma_cookie_assign()
dereferencing a bogus tx->chan, but I think that's beyond my ability to
figure out right now. Similarly the storm of WARNs from pl330_issue_pending()
when using a large number of small buffers and dmatest.noverify=1. This
one was some obvious low-hanging fruit.


Sorry, that part of code has changed a lot since I wrote the driver,
so more details will help me.


Here's the actual splat:

[  220.649267] Unable to handle kernel paging request at virtual address 
100341338

[  220.666576] pgd = ff8008dcb000
[  220.679805] [100341338] *pgd=, *pud=
[  220.696116] Internal error: Oops: 9605 [#1] PREEMPT SMP
[  220.711579] Modules linked in:
[  220.724459] CPU: 3 PID: 1249 Comm: dma0chan5-copy0 Not tainted 
4.6.0-rc4+ #527

[  220.741652] Hardware name: ARM Juno development board (r1) (DT)
[  220.757512] task: ffc976255400 ti: ffc975aa4000 task.ti: 
ffc975aa4000

[  220.774979] PC is at pl330_tx_submit+0x68/0x108
[  220.789412] LR is at pl330_tx_submit+0x2c/0x108
[  220.803704] pc : [] lr : [] 
pstate: 81c5

[  220.821078] sp : ffc975aa7c90
[  220.834236] x29: ffc975aa7c90 x28: ffc975e81800
[  220.849303] x27: ffc97626c018 x26: ff8008cd9000
[  220.864314] x25:  x24: 
[  220.879300] x23: ff8008d8d410 x22: ff8008ce52f0
[  220.894266] x21: ffc976220d18 x20: ffc079138610
[  220.909139] x19: ffc976220c60 x18: 0010
[  220.923857] x17: 0007 x16: 0001
[  

Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

2016-04-26 Thread Jassi Brar
On Tue, Apr 26, 2016 at 10:00 PM, Robin Murphy  wrote:
> The current logic in pl330_get_desc() contains a clear race condition,
> whereby if the descriptor pool is empty, we will create a new
> descriptor, add it to the pool with the lock held, *release the lock*,
> then try to remove it from the pool again.
>
> Furthermore, if that second attempt then finds the pool empty because
> someone else got there first, we conclude that something must have gone
> terribly wrong and it's the absolute end of the world.
>
We may retry or simply increase the number of descriptors allocated in
a batch from 1 to, say, NR_DEFAULT_DESC.

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 372b435..7179a4d 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2449,7 +2449,7 @@ static struct dma_pl330_desc
*pl330_get_desc(struct dma_pl330_chan *pch)

/* If the DMAC pool is empty, alloc new */
if (!desc) {
-   if (!add_desc(pl330, GFP_ATOMIC, 1))
+   if (!add_desc(pl330, GFP_ATOMIC, NR_DEFAULT_DESC))
return NULL;

/* Try again */

> ...
> [  709.328891] dma-pl330 7ff0.dma: pl330_get_desc:2459 ALERT!
> ** 10 printk messages dropped ** [  709.352280] dma-pl330 7ff0.dma: 
> __pl330_prep_dma_memcpy:2493 Unable to fetch desc
> ** 11 printk messages dropped ** [  709.372930] dma-pl330 7ff0.dma: 
> pl330_get_desc:2459 ALERT!
> ** 2 printk messages dropped ** [  709.372953] dma-pl330 7ff0.dma: 
> pl330_get_desc:2459 ALERT!
> ** 8 printk messages dropped ** [  709.374157] dma-pl330 7ff0.dma: 
> __pl330_prep_dma_memcpy:2493 Unable to fetch desc
> ** 41 printk messages dropped ** [  709.393095] dmatest: dma0chan4-copy1: 
> result #4545: 'prep error' with src_off=0x3a32 dst_off=0x46dd len=0xc5c3 (0)
> ...
>
> Down with this sort of thing! The most sensible thing to do is avoid the
> allocate-add-remove dance entirely and simply use the freshly-allocated
> descriptor straight away.
>
... but you still try to first pluck from the list.
Instead of churning the code, I would suggest either check in a loop
that we have a desc OR allocate 2 or NR_DEFAULT_DESC descriptors
there. Probably we get more descriptors at the same cost of memory.

>
> I'm also seeing what looks like another occasional race under the same
> conditions where pl330_tx_submit() blows up from dma_cookie_assign()
> dereferencing a bogus tx->chan, but I think that's beyond my ability to
> figure out right now. Similarly the storm of WARNs from pl330_issue_pending()
> when using a large number of small buffers and dmatest.noverify=1. This
> one was some obvious low-hanging fruit.
>
Sorry, that part of code has changed a lot since I wrote the driver,
so more details will help me.

Thanks.


Re: [PATCH] dmaengine: pl330: Fix race in pl330_get_desc()

2016-04-26 Thread Jassi Brar
On Tue, Apr 26, 2016 at 10:00 PM, Robin Murphy  wrote:
> The current logic in pl330_get_desc() contains a clear race condition,
> whereby if the descriptor pool is empty, we will create a new
> descriptor, add it to the pool with the lock held, *release the lock*,
> then try to remove it from the pool again.
>
> Furthermore, if that second attempt then finds the pool empty because
> someone else got there first, we conclude that something must have gone
> terribly wrong and it's the absolute end of the world.
>
We may retry or simply increase the number of descriptors allocated in
a batch from 1 to, say, NR_DEFAULT_DESC.

diff --git a/drivers/dma/pl330.c b/drivers/dma/pl330.c
index 372b435..7179a4d 100644
--- a/drivers/dma/pl330.c
+++ b/drivers/dma/pl330.c
@@ -2449,7 +2449,7 @@ static struct dma_pl330_desc
*pl330_get_desc(struct dma_pl330_chan *pch)

/* If the DMAC pool is empty, alloc new */
if (!desc) {
-   if (!add_desc(pl330, GFP_ATOMIC, 1))
+   if (!add_desc(pl330, GFP_ATOMIC, NR_DEFAULT_DESC))
return NULL;

/* Try again */

> ...
> [  709.328891] dma-pl330 7ff0.dma: pl330_get_desc:2459 ALERT!
> ** 10 printk messages dropped ** [  709.352280] dma-pl330 7ff0.dma: 
> __pl330_prep_dma_memcpy:2493 Unable to fetch desc
> ** 11 printk messages dropped ** [  709.372930] dma-pl330 7ff0.dma: 
> pl330_get_desc:2459 ALERT!
> ** 2 printk messages dropped ** [  709.372953] dma-pl330 7ff0.dma: 
> pl330_get_desc:2459 ALERT!
> ** 8 printk messages dropped ** [  709.374157] dma-pl330 7ff0.dma: 
> __pl330_prep_dma_memcpy:2493 Unable to fetch desc
> ** 41 printk messages dropped ** [  709.393095] dmatest: dma0chan4-copy1: 
> result #4545: 'prep error' with src_off=0x3a32 dst_off=0x46dd len=0xc5c3 (0)
> ...
>
> Down with this sort of thing! The most sensible thing to do is avoid the
> allocate-add-remove dance entirely and simply use the freshly-allocated
> descriptor straight away.
>
... but you still try to first pluck from the list.
Instead of churning the code, I would suggest either check in a loop
that we have a desc OR allocate 2 or NR_DEFAULT_DESC descriptors
there. Probably we get more descriptors at the same cost of memory.

>
> I'm also seeing what looks like another occasional race under the same
> conditions where pl330_tx_submit() blows up from dma_cookie_assign()
> dereferencing a bogus tx->chan, but I think that's beyond my ability to
> figure out right now. Similarly the storm of WARNs from pl330_issue_pending()
> when using a large number of small buffers and dmatest.noverify=1. This
> one was some obvious low-hanging fruit.
>
Sorry, that part of code has changed a lot since I wrote the driver,
so more details will help me.

Thanks.