Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-20 Thread Konrad Rzeszutek Wilk
On Sun, Apr 19, 2015 at 05:43:18PM +0200, Dorian Gray wrote:
> I think the case is closed.
> Now that I know it's not USB, but wireless driver, I looked through
> the new k3.19.5's changelog and saw this:
> 
> 
> commit b943e69d33fac1e5f6db57868e061096b0aae67a
> Author: Larry Finger 
> Date:   Sat Mar 21 15:16:05 2015 -0500
> 
> rtlwifi: Fix IOMMU mapping leak in AP mode
> 
> commit be0b5e635883678bfbc695889772fed545f3427d upstream.
> 
> Transmission of an AP beacon does not call the TX interrupt service routine,
> which usually does the cleanup. Instead, cleanup is handled in a tasklet
> completion routine. Unfortunately, this routine has a serious bug in that it
> does not release the DMA mapping before it frees the skb, thus one IOMMU
> mapping is leaked for each beacon. The test system failed with no free IOMMU
> mapping slots approximately one hour after hostapd was used to start an AP.
> 
> This issue was reported and tested at
> https://github.com/lwfinger/rtlwifi_new/issues/30.
> 
> Reported-and-tested-by: Kevin Mullican 
> Cc: Kevin Mullican 
> Signed-off-by: Shao Fu 
> Signed-off-by: Larry Finger 
> Signed-off-by: Kalle Valo 
> Signed-off-by: Greg Kroah-Hartman 
> 
> 
> Looks very related, especially because my wireless card is also always
> in AP mode, however I haven't been actually using it lately, so
> probably that's why I didn't notice anything related to it (and kept
> focused on USB), until I used dump_dma.
> 
> Well, due to my minimal knowledge regarding kernel's internals I can't
> be 100% sure that this was it, but so far 3.19.5 is working stable
> (uptime 6hrs and counting).

Sweet!
> 
> Thank you Konrad (and everyone else involved) for helping me out to
> pinpoint the actual culprit.

Sure thing. Happy to have been able to help!
> Jake
> 
> 
> On 18 April 2015 at 21:59, Dorian Gray  wrote:
> > On 18 April 2015 at 12:10, Dorian Gray  wrote:
> >> On 17 April 2015 at 22:06, Konrad Rzeszutek Wilk wrote:
> >>> On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
> >>>> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk wrote:
> >>>> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> >>>> > and then load the attached module.
> >>>> >
> >>>> > That should tell you who and what else is holding on the buffers.
> >>>>
> >>>> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you
> >>>> sent me.
> >>>> Now, I'm not sure if I've done it right - I waited until the error
> >>>> occurred and then modprobe'd dump_dma.
> >>>> I have attached the kernel log, but it tells me not much, if anything...
> >>>
> >>> The network driver is quite hungry for DMA. Did it do the same thing
> >>> in the earlier kernels?
> >>>
> >>> Thanks.
> >>>>
> >>>> Thanks again.
> >>>> Jake
> >>>
> >>>
> >>
> >> Yeah, you're right:
> >>
> >> # grep rtl8192se dump_dma_k3.19.4.log | wc -l
> >> 6789
> >> #
> >> # grep rtl8192se dump_dma_k3.17.8.log | wc -l
> >> 162
> >> #
> >>
> >> So, wlan driver would be the real culprit then..?
> >> I would have never thought...
> >>
> >> I guess I'm gonna test 3.19.4 once more (just to be sure) with
> >> rtl8192se removed and see what happens.
> >>
> >> Thanks!
> >> Jake
> >
> >
> > [update]
> >
> > Ok, 6 hours of uptime (3.19.4 + blacklisted rtl8192se) and everything
> > was fine...
> > However, I was checking periodically and noticed that 'radeon' also
> > tends to grow continuously over time, whereas ethernet driver sticks
> > to, more or less, the same range:
> >
> > # uname -r
> > 3.19.4
> > #
> > # grep -Eo 'radeon|r8169' L1.log | sort | uniq -c
> >      62 r8169
> >    4183 radeon
> > #
> > # grep -Eo 'radeon|r8169' L2.log | sort | uniq -c
> >      33 r8169
> >    5582 radeon
> > #
> > # grep -Eo 'radeon|r8169' L3.log | sort | uniq -c
> >      54 r8169
> >    7007 radeon
> > #
> > # grep -Eo 'radeon|r8169' L4.log | sort | uniq -c
> >      49 r8169
> >    7429 radeon
> > #
> > # grep -Eo 'radeon|r8169' L5.log | sort | uniq -c
> >      34 r8169
> >    9360 radeon
> > #
> >
> > It doesn't grow that much in 3.17.8:
> >
> > # uname -r
> > 3.17.8
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L1.log | sort | uniq -c
> >     265 r8169
> >    1229 radeon
> >     142 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L2.log | sort | uniq -c
> >     187 r8169
> >    3159 radeon
> >     124 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L3.log | sort | uniq -c
> >      41 r8169
> >    1894 radeon
> >      39 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L4.log | sort | uniq -c
> >      64 r8169
> >    3370 radeon
> >      77 rtl8192se
> > #
> > # grep -Eo 'radeon|r8169|rtl8192se' L5.log | sort | uniq -c
> >      52 r8169
> >    2597 radeon
> >      49 rtl8192se
> > #
> >
> >
> > Btw, at some point (3.19.4) I encountered this:
> > [21631.181909] DMA-API: debugging out of memory - disabling
> >
> > Jake

Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-19 Thread Dorian Gray
I think the case is closed.
Now that I know it's not USB, but wireless driver, I looked through
the new k3.19.5's changelog and saw this:


commit b943e69d33fac1e5f6db57868e061096b0aae67a
Author: Larry Finger 
Date:   Sat Mar 21 15:16:05 2015 -0500

rtlwifi: Fix IOMMU mapping leak in AP mode

commit be0b5e635883678bfbc695889772fed545f3427d upstream.

Transmission of an AP beacon does not call the TX interrupt service routine,
which usually does the cleanup. Instead, cleanup is handled in a tasklet
completion routine. Unfortunately, this routine has a serious bug in that it
does not release the DMA mapping before it frees the skb, thus one IOMMU
mapping is leaked for each beacon. The test system failed with no free IOMMU
mapping slots approximately one hour after hostapd was used to start an AP.

This issue was reported and tested at
https://github.com/lwfinger/rtlwifi_new/issues/30.

Reported-and-tested-by: Kevin Mullican 
Cc: Kevin Mullican 
Signed-off-by: Shao Fu 
Signed-off-by: Larry Finger 
Signed-off-by: Kalle Valo 
Signed-off-by: Greg Kroah-Hartman 


Looks very related, especially because my wireless card is also always
in AP mode, however I haven't been actually using it lately, so
probably that's why I didn't notice anything related to it (and kept
focused on USB), until I used dump_dma.

Well, due to my minimal knowledge regarding kernel's internals I can't
be 100% sure that this was it, but so far 3.19.5 is working stable
(uptime 6hrs and counting).
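
As a rough plausibility check of the "approximately one hour" figure in the commit message, here is a back-of-the-envelope sketch. Apart from the 64MB default pool size mentioned elsewhere in this thread, every number is an assumption: a 2 KiB swiotlb slab, exactly one slab leaked per beacon, and a beacon interval of 100 TU (1 TU = 1024 microseconds), assumed here to be the hostapd default. The commit's test system may have used a hardware IOMMU with a different number of slots, so this only indicates the order of magnitude.

```shell
#!/bin/sh
# Assumptions (not from the thread): 2 KiB per swiotlb slab, one slab leaked
# per beacon, beacon interval of 100 TU with 1 TU = 1024 microseconds.
POOL_MIB=64          # default pool size as stated in this thread
SLAB_BYTES=2048
BEACON_INT_TU=100
TU_USEC=1024

# Number of slabs in the pool.
slabs=$(( POOL_MIB * 1024 * 1024 / SLAB_BYTES ))

# Time to leak every slab, one per beacon (needs 64-bit shell arithmetic).
usec=$(( slabs * BEACON_INT_TU * TU_USEC ))
minutes=$(( usec / 1000000 / 60 ))

echo "slabs in pool: $slabs"
echo "minutes until an otherwise idle pool is exhausted: $minutes"
```

Under these assumptions the pool holds 32768 slabs and would last about 55 minutes, which is in the same range as the commit's figure. Other drivers sharing the pool would shorten that.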

Thank you Konrad (and everyone else involved) for helping me out to
pinpoint the actual culprit.
Jake


On 18 April 2015 at 21:59, Dorian Gray  wrote:
> On 18 April 2015 at 12:10, Dorian Gray  wrote:
>> On 17 April 2015 at 22:06, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
>>>> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk wrote:
>>>> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
>>>> > and then load the attached module.
>>>> >
>>>> > That should tell you who and what else is holding on the buffers.
>>>>
>>>> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you
>>>> sent me.
>>>> Now, I'm not sure if I've done it right - I waited until the error
>>>> occurred and then modprobe'd dump_dma.
>>>> I have attached the kernel log, but it tells me not much, if anything...
>>>
>>> The network driver is quite hungry for DMA. Did it do the same thing
>>> in the earlier kernels?
>>>
>>> Thanks.
>>>>
>>>> Thanks again.
>>>> Jake
>>>
>>>
>>
>> Yeah, you're right:
>>
>> # grep rtl8192se dump_dma_k3.19.4.log | wc -l
>> 6789
>> #
>> # grep rtl8192se dump_dma_k3.17.8.log | wc -l
>> 162
>> #
>>
>> So, wlan driver would be the real culprit then..?
>> I would have never thought...
>>
>> I guess I'm gonna test 3.19.4 once more (just to be sure) with
>> rtl8192se removed and see what happens.
>>
>> Thanks!
>> Jake
>
>
> [update]
>
> Ok, 6 hours of uptime (3.19.4 + blacklisted rtl8192se) and everything
> was fine...
> However, I was checking periodically and noticed that 'radeon' also
> tends to grow continuously over time, whereas ethernet driver sticks
> to, more or less, the same range:
>
> # uname -r
> 3.19.4
> #
> # grep -Eo 'radeon|r8169' L1.log | sort | uniq -c
>      62 r8169
>    4183 radeon
> #
> # grep -Eo 'radeon|r8169' L2.log | sort | uniq -c
>      33 r8169
>    5582 radeon
> #
> # grep -Eo 'radeon|r8169' L3.log | sort | uniq -c
>      54 r8169
>    7007 radeon
> #
> # grep -Eo 'radeon|r8169' L4.log | sort | uniq -c
>      49 r8169
>    7429 radeon
> #
> # grep -Eo 'radeon|r8169' L5.log | sort | uniq -c
>      34 r8169
>    9360 radeon
> #
>
> It doesn't grow that much in 3.17.8:
>
> # uname -r
> 3.17.8
> #
> # grep -Eo 'radeon|r8169|rtl8192se' L1.log | sort | uniq -c
>     265 r8169
>    1229 radeon
>     142 rtl8192se
> #
> # grep -Eo 'radeon|r8169|rtl8192se' L2.log | sort | uniq -c
>     187 r8169
>    3159 radeon
>     124 rtl8192se
> #
> # grep -Eo 'radeon|r8169|rtl8192se' L3.log | sort | uniq -c
>      41 r8169
>    1894 radeon
>      39 rtl8192se
> #
> # grep -Eo 'radeon|r8169|rtl8192se' L4.log | sort | uniq -c
>      64 r8169
>    3370 radeon
>      77 rtl8192se
> #
> # grep -Eo 'radeon|r8169|rtl8192se' L5.log | sort | uniq -c
>      52 r8169
>    2597 radeon
>      49 rtl8192se
> #
>
>
> Btw, at some point (3.19.4) I encountered this:
> [21631.181909] DMA-API: debugging out of memory - disabling
>
> Jake
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-18 Thread Dorian Gray
On 18 April 2015 at 12:10, Dorian Gray  wrote:
> On 17 April 2015 at 22:06, Konrad Rzeszutek Wilk wrote:
>> On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
>>> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk wrote:
>>> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
>>> > and then load the attached module.
>>> >
>>> > That should tell you who and what else is holding on the buffers.
>>>
>>> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you
>>> sent me.
>>> Now, I'm not sure if I've done it right - I waited until the error
>>> occurred and then modprobe'd dump_dma.
>>> I have attached the kernel log, but it tells me not much, if anything...
>>
>> The network driver is quite hungry for DMA. Did it do the same thing
>> in the earlier kernels?
>>
>> Thanks.
>>>
>>> Thanks again.
>>> Jake
>>
>>
>
> Yeah, you're right:
>
> # grep rtl8192se dump_dma_k3.19.4.log | wc -l
> 6789
> #
> # grep rtl8192se dump_dma_k3.17.8.log | wc -l
> 162
> #
>
> So, wlan driver would be the real culprit then..?
> I would have never thought...
>
> I guess I'm gonna test 3.19.4 once more (just to be sure) with
> rtl8192se removed and see what happens.
>
> Thanks!
> Jake


[update]

Ok, 6 hours of uptime (3.19.4 + blacklisted rtl8192se) and everything
was fine...
However, I was checking periodically and noticed that 'radeon' also
tends to grow continuously over time, whereas ethernet driver sticks
to, more or less, the same range:

# uname -r
3.19.4
#
# grep -Eo 'radeon|r8169' L1.log | sort | uniq -c
     62 r8169
   4183 radeon
#
# grep -Eo 'radeon|r8169' L2.log | sort | uniq -c
     33 r8169
   5582 radeon
#
# grep -Eo 'radeon|r8169' L3.log | sort | uniq -c
     54 r8169
   7007 radeon
#
# grep -Eo 'radeon|r8169' L4.log | sort | uniq -c
     49 r8169
   7429 radeon
#
# grep -Eo 'radeon|r8169' L5.log | sort | uniq -c
     34 r8169
   9360 radeon
#

It doesn't grow that much in 3.17.8:

# uname -r
3.17.8
#
# grep -Eo 'radeon|r8169|rtl8192se' L1.log | sort | uniq -c
    265 r8169
   1229 radeon
    142 rtl8192se
#
# grep -Eo 'radeon|r8169|rtl8192se' L2.log | sort | uniq -c
    187 r8169
   3159 radeon
    124 rtl8192se
#
# grep -Eo 'radeon|r8169|rtl8192se' L3.log | sort | uniq -c
     41 r8169
   1894 radeon
     39 rtl8192se
#
# grep -Eo 'radeon|r8169|rtl8192se' L4.log | sort | uniq -c
     64 r8169
   3370 radeon
     77 rtl8192se
#
# grep -Eo 'radeon|r8169|rtl8192se' L5.log | sort | uniq -c
     52 r8169
   2597 radeon
     49 rtl8192se
#
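
The per-log counts above can also be collected in one pass, which makes the growth easier to see side by side. This is only a sketch: count_drivers is a made-up helper name, and it assumes, as the greps above do, that each dump_dma log mentions the driver name as plain text once per mapping.

```shell
#!/bin/sh
# Print one row per log file with the number of occurrences of each driver name.
# Usage: count_drivers "radeon r8169 rtl8192se" L1.log L2.log L3.log L4.log L5.log
count_drivers() {
    drivers=$1
    shift
    for log in "$@"; do
        printf '%s' "$log"
        for drv in $drivers; do
            # -o prints every match on its own line, -F treats the name literally;
            # tr strips the padding some wc implementations add.
            n=$(grep -o -F -- "$drv" "$log" | wc -l | tr -d ' ')
            printf ' %s=%s' "$drv" "$n"
        done
        printf '\n'
    done
}
```

Each output row has the form "L1.log radeon=... r8169=... rtl8192se=...", so a steadily climbing column stands out without rerunning the pipeline per file.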


Btw, at some point (3.19.4) I encountered this:
[21631.181909] DMA-API: debugging out of memory - disabling

Jake


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-18 Thread Dorian Gray
On 17 April 2015 at 22:06, Konrad Rzeszutek Wilk  wrote:
> On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
>> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk wrote:
>> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
>> > and then load the attached module.
>> >
>> > That should tell you who and what else is holding on the buffers.
>>
>> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you
>> sent me.
>> Now, I'm not sure if I've done it right - I waited until the error
>> occurred and then modprobe'd dump_dma.
>> I have attached the kernel log, but it tells me not much, if anything...
>
> The network driver is quite hungry for DMA. Did it do the same thing
> in the earlier kernels?
>
> Thanks.
>>
>> Thanks again.
>> Jake
>
>

Yeah, you're right:

# grep rtl8192se dump_dma_k3.19.4.log | wc -l
6789
#
# grep rtl8192se dump_dma_k3.17.8.log | wc -l
162
#

So, wlan driver would be the real culprit then..?
I would have never thought...

I guess I'm gonna test 3.19.4 once more (just to be sure) with
rtl8192se removed and see what happens.

Thanks!
Jake


dump_dma_logs.tar.bz2
Description: BZip2 compressed data


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-17 Thread Konrad Rzeszutek Wilk
On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk wrote:
> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> > and then load the attached module.
> >
> > That should tell you who and what else is holding on the buffers.
>
> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you sent me.
> Now, I'm not sure if I've done it right - I waited until the error
> occurred and then modprobe'd dump_dma.
> I have attached the kernel log, but it tells me not much, if anything...

The network driver is quite hungry for DMA. Did it do the same thing
in the earlier kernels?

Thanks.
> 
> Thanks again.
> Jake




Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-17 Thread Dorian Gray
On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk  wrote:
> An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> and then load the attached module.
>
> That should tell you who and what else is holding on the buffers.

Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you sent me.
Now, I'm not sure if I've done it right - I waited until the error
occurred and then modprobe'd dump_dma.
I have attached the kernel log, but it tells me not much, if anything...
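
For anyone repeating this, a minimal sketch of checking that the running kernel was really built with CONFIG_DMA_API_DEBUG and of reading the DMA-API debug counters. The config file path and the debugfs mount point are assumptions (distributions differ), and has_dma_api_debug is a made-up helper name. The dump_dma module itself is the one attached earlier in the thread and is not covered here.

```shell
#!/bin/sh
# Return success if the given kernel config file enables CONFIG_DMA_API_DEBUG.
# Handles plain-text configs only; /proc/config.gz would need zcat first.
has_dma_api_debug() {
    grep -q '^CONFIG_DMA_API_DEBUG=y$' "$1"
}

cfg="/boot/config-$(uname -r)"    # assumed location; distros differ
if [ -r "$cfg" ] && has_dma_api_debug "$cfg"; then
    echo "CONFIG_DMA_API_DEBUG is enabled"
    # Assumes debugfs is mounted at /sys/kernel/debug and the script runs as root.
    for f in num_free_entries min_free_entries error_count; do
        if [ -r "/sys/kernel/debug/dma-api/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "/sys/kernel/debug/dma-api/$f")"
        fi
    done
else
    echo "CONFIG_DMA_API_DEBUG not found in $cfg"
fi
```

A min_free_entries value near zero would suggest the debug code is about to run out of preallocated tracking entries. The dma_debug_entries= boot parameter is documented as the way to raise that number.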

Thanks again.
Jake


dump_dma.log.tar.bz2
Description: BZip2 compressed data


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-17 Thread Dorian Gray
On 16 April 2015 at 18:57, Dorian Gray  wrote:
> On 16 April 2015 at 16:24, Suman Tripathi  wrote:
>> Try increasing the SWIOTLB size to 128MB. Default is 64MB.
>
> Ok, so I'm back to k3.18.7 (default in the latest Fatdog), although
> I'm not sure what should be the exact value of swiotlb boot param?
> Got totally mixed results from uncle Google - some say the unit is in
> MiB, some that it's 4k pages and another that 128MiB = 65536, so I
> played it safe and used swiotlb=131072.
> Is this correct?
> It may take a few days, but I'll let you know if it worked (or for how
> long, if not).

I was running 3.18.7 + swiotlb=131072 + 2 external drives plugged-in
and mounted for about 18 hours straight. The error didn't show up.
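
On the unit question quoted above: to the best of my knowledge the integer given to swiotlb= is a count of I/O TLB slabs, and a slab is 2 KiB (IO_TLB_SHIFT is 11 in the kernel source), so it is neither MiB nor 4k pages. Treat that slab size as an assumption and check it against your kernel's documentation. A minimal sketch of the conversion, with made-up helper names:

```shell
#!/bin/sh
# Assumption: one swiotlb slab is 2 KiB (1 << IO_TLB_SHIFT, with IO_TLB_SHIFT = 11).
SLAB_BYTES=2048

# Convert a swiotlb= slab count to MiB (hypothetical helper).
slabs_to_mib() {
    echo $(( $1 * SLAB_BYTES / 1024 / 1024 ))
}

# Convert a size in MiB to the slab count to pass to swiotlb= (hypothetical helper).
mib_to_slabs() {
    echo $(( $1 * 1024 * 1024 / SLAB_BYTES ))
}

slabs_to_mib 131072   # the value used in this message
mib_to_slabs 128      # the size that was suggested
```

By this arithmetic swiotlb=131072 requests 256 MiB, and the suggested 128 MiB would be swiotlb=65536, which matches one of the answers found by searching.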

Well, I would run it a little longer, but I had to restart X and while
doing so, the system crashed for an unknown reason.

Anyway, this seems to be quite a reliable workaround - at least I can
_use_ kernels newer than 3.17.8, because with that bug, popping up
after a couple of hours of uptime, it was a total show stopper to me.

Thanks!
Jake


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-17 Thread Dorian Gray
On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote:
> An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> and then load the attached module.
>
> That should tell you who and what else is holding on to the buffers.

Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you sent me.
Now, I'm not sure if I've done it right - I waited until the error
occurred and then modprobe'd dump_dma.
I have attached the kernel log, but it tells me not much, if anything...

Thanks again.
Jake


dump_dma.log.tar.bz2
Description: BZip2 compressed data


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-17 Thread Konrad Rzeszutek Wilk
On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote:
> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> > and then load the attached module.
> >
> > That should tell you who and what else is holding on to the buffers.
>
> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you sent me.
> Now, I'm not sure if I've done it right - I waited until the error
> occurred and then modprobe'd dump_dma.
> I have attached the kernel log, but it tells me not much, if anything...

The network driver is quite hungry for DMA. Did it do the same thing
in the earlier kernels?

Thanks.
>
> Thanks again.
> Jake




Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Dorian Gray
On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk  wrote:
> An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> and then load the attached module.
>
> That should tell you who and what else is holding on to the buffers.

Thanks, this will be my next step then, right after I'm done with
testing the increased SWIOTLB.

Jake


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Konrad Rzeszutek Wilk
On Thu, Apr 16, 2015 at 06:57:46PM +0200, Dorian Gray wrote:
> On 16 April 2015 at 16:15, Alan Stern  wrote:
> > This appears to be a problem with the IOMMU or SWIOTLB subsystems, not
> > the USB subsystem.  I have CC'ed the appropriate mailing lists.
> 
> Thanks, I'm far from being a kernel expert, so was expecting it could
> be wrong subsection.
> 
> 
> 
> On 16 April 2015 at 16:24, Suman Tripathi  wrote:
> > Try increasing the SWIOTLB size to 128MB .Default is 64MB.
> 
> Ok, so I'm back to k3.18.7 (default in the latest Fatdog), although
> I'm not sure what should be the exact value of swiotlb boot param?
> Got totally mixed results from uncle Google - some says the unit is in
> MiB, some that it's 4k pages and another that 128MiB = 65536, so I
> played it safe and used swiotlb=131072.
> Is this correct?
> It may take a few days, but I'll let you know if it worked (or for how
> long, if not).
> 
> 
> 
> On 16 April 2015 at 16:54, Alexander Duyck  wrote:
> > More likely would be a device driver that is DMA mapping memory but not
> > unmapping it after it is done resulting in the bounce buffer pool being
> > depleted.
> > You might want to dump the list of drivers loaded on the system with lsmod,
> > and then possibly look at doing a git bisect for something introduced
> > between 3.17 and 3.18 since that seems to be when you started seeing
> > this issue.
> 
> Ok, I'll (try to) look at this, but like I said - I'm not a kernel
> (nor git) expert.
> Anyway, I guess I'm gonna start with this:
> https://wiki.gentoo.org/wiki/Kernel_git-bisect
> Who knows...perhaps I'll find something...

An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
and then load the attached module.

That should tell you who and what else is holding on to the buffers.


> 
> 
> 
> Thank you all for the replies.
> Jake
> ___
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
/*
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License v2.0 as published by
 * the Free Software Foundation
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 */

/*
 * The header names did not survive archiving. The code below needs only
 * these three; debug_dma_dump_mappings() is declared in <linux/dma-debug.h>.
 */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/dma-debug.h>

#define DUMP_DMA_FUN  "0.1"

MODULE_AUTHOR("Konrad Rzeszutek Wilk");
MODULE_DESCRIPTION("dump dma");
MODULE_LICENSE("GPL");
MODULE_VERSION(DUMP_DMA_FUN);

/* On load, print every DMA mapping still outstanding to the kernel log. */
static int __init dump_dma_init(void)
{
	/* NULL selects all devices; requires CONFIG_DMA_API_DEBUG=y. */
	debug_dma_dump_mappings(NULL);
	return 0;
}

/* Nothing to tear down; the module exists only for its load-time dump. */
static void __exit dump_dma_exit(void)
{
}

module_init(dump_dma_init);
module_exit(dump_dma_exit);


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Dorian Gray
On 16 April 2015 at 16:15, Alan Stern  wrote:
> This appears to be a problem with the IOMMU or SWIOTLB subsystems, not
> the USB subsystem.  I have CC'ed the appropriate mailing lists.

Thanks, I'm far from being a kernel expert, so was expecting it could
be wrong subsection.



On 16 April 2015 at 16:24, Suman Tripathi  wrote:
> Try increasing the SWIOTLB size to 128MB .Default is 64MB.

Ok, so I'm back to k3.18.7 (default in the latest Fatdog), although
I'm not sure what the exact value of the swiotlb boot param should be.
I got totally mixed results from uncle Google - some say the unit is
MiB, some that it's 4k pages, and another that 128MiB = 65536, so I
played it safe and used swiotlb=131072.
Is this correct?
It may take a few days, but I'll let you know if it worked (or for how
long, if not).
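
For reference, the unit question above can be worked through with a little
arithmetic. This is only a sketch: it assumes the swiotlb= value counts
I/O TLB slabs of 2 KiB each (IO_TLB_SHIFT of 11 in kernels of this era),
and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: one swiotlb slab is 2 KiB (1 << 11 bytes). */
#define SLAB_SHIFT 11u

/* Number of slabs needed to cover `mib` mebibytes of bounce buffer. */
uint64_t swiotlb_slabs_for_mib(uint64_t mib)
{
    return (mib << 20) >> SLAB_SHIFT;
}

/* Size in MiB reserved by a given swiotlb=<nslabs> value. */
uint64_t swiotlb_mib_for_slabs(uint64_t nslabs)
{
    assert(nslabs <= (UINT64_MAX >> SLAB_SHIFT)); /* avoid overflow */
    return (nslabs << SLAB_SHIFT) >> 20;
}

/* Print the boot parameter for a desired size (hypothetical helper). */
void print_swiotlb_param(uint64_t mib)
{
    printf("swiotlb=%llu  (%llu MiB)\n",
           (unsigned long long)swiotlb_slabs_for_mib(mib),
           (unsigned long long)mib);
}
```

Under that assumption, swiotlb=65536 corresponds to 128 MiB and
swiotlb=131072 to 256 MiB, so the value used here is larger than the
suggested 128 MB rather than too small.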



On 16 April 2015 at 16:54, Alexander Duyck  wrote:
> More likely would be a device driver that is DMA mapping memory but not
> unmapping it after it is done resulting in the bounce buffer pool being
> depleted.
> You might want to dump the list of drivers loaded on the system with lsmod,
> and then possibly look at doing a git bisect for something introduced
> between 3.17 and 3.18 since that seems to be when you started seeing
> this issue.

Ok, I'll (try to) look at this, but like I said - I'm not a kernel
(nor git) expert.
Anyway, I guess I'm gonna start with this:
https://wiki.gentoo.org/wiki/Kernel_git-bisect
Who knows...perhaps I'll find something...



Thank you all for the replies.
Jake


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Alexander Duyck
On 04/16/2015 07:15 AM, Alan Stern wrote:
> On Thu, 16 Apr 2015, Dorian Gray wrote:
>
>> I have tested the following kernel versions:
>> - 3.18.4, 3.18.6, 3.18.7, 3.19.4 [all affected]
>> - 3.17.1 [unaffected]
>> - 3.17.8 [probably the last unaffected version; I'm using it currently]
>>
>> Also, I've been using the very same configuration (hardware) along
>> with 2.6.x, 3.2.x, 3.4.x, 3.10.x and have never encountered such a
>> behavior before.
>>
>> And the problem is:
>>
>> When at least one external drive is plugged-in AND mounted, after ~2-4
>> hours the following occurs (@11315.681561):
>>
>> [ 5570.110523] usb 2-1.2: new high-speed USB device number 5 using ehci-pci
>> [ 5570.852917] usb 2-1.2: New USB device found, idVendor=1058, idProduct=0730
>> [ 5570.852923] usb 2-1.2: New USB device strings: Mfr=1, Product=2,
>> SerialNumber=3
>> [ 5570.852927] usb 2-1.2: Product: My Passport 0730
>> [ 5570.852930] usb 2-1.2: Manufacturer: Western Digital
>> [ 5570.852933] usb 2-1.2: SerialNumber:
>> [ 5570.853517] usb-storage 2-1.2:1.0: USB Mass Storage device detected
>> [ 5570.853691] scsi host8: usb-storage 2-1.2:1.0
>> [ 5572.932659] scsi 8:0:0:0: Direct-Access WD   My Passport
>> 0730 1012 PQ: 0 ANSI: 6
>> [ 5572.933013] sd 8:0:0:0: Attached scsi generic sg5 type 0
>> [ 5575.306801] scsi 8:0:0:1: Enclosure WD   SES Device
>>   1012 PQ: 0 ANSI: 6
>> [ 5575.307160] sd 8:0:0:0: [sdc] 976707584 512-byte logical blocks:
>> (500 GB/465 GiB)
>> [ 5575.308405] sd 8:0:0:0: [sdc] Write Protect is off
>> [ 5575.308416] sd 8:0:0:0: [sdc] Mode Sense: 47 00 10 08
>> [ 5575.309772] sd 8:0:0:0: [sdc] No Caching mode page found
>> [ 5575.309776] sd 8:0:0:0: [sdc] Assuming drive cache: write through
>> [ 5575.311176] scsi 8:0:0:1: Attached scsi generic sg6 type 13
>> [ 5575.328540]  sdc: sdc1
>> [ 5575.331026] sd 8:0:0:0: [sdc] Attached SCSI disk
>> [11315.681561] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 
>> bytes)
>> [11315.681565] DMA: Out of SW-IOMMU space for 32768 bytes at device 
>> :00:1d.0
>> [11315.681874] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 
>> bytes)
>> [11315.681876] DMA: Out of SW-IOMMU space for 32768 bytes at device 
>> :00:1d.0
>> [11315.682171] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 
>> bytes)
>> [11315.682174] DMA: Out of SW-IOMMU space for 32768 bytes at device 
>> :00:1d.0
>> [...and so on...]
> This appears to be a problem with the IOMMU or SWIOTLB subsystems, not
> the USB subsystem.  I have CC'ed the appropriate mailing lists.
>
> Alan Stern

More likely would be a device driver that is DMA mapping memory but not
unmapping it after it is done resulting in the bounce buffer pool being
depleted.
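
The failure mode described above can be illustrated with a toy model. This
is a userspace sketch, not kernel code: the pool size, the struct and the
function names are all hypothetical, and a real driver would be calling
the DMA mapping API instead.

```c
#include <assert.h>
#include <stdbool.h>

#define POOL_SLOTS 8   /* toy size; the real pool has thousands of slabs */

struct bounce_pool {
    bool in_use[POOL_SLOTS];
};

/* Reserve one slot; returns its index, or -1 when the pool is exhausted. */
int pool_map(struct bounce_pool *p)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Release a slot obtained from pool_map(). */
void pool_unmap(struct bounce_pool *p, int slot)
{
    assert(slot >= 0 && slot < POOL_SLOTS);
    p->in_use[slot] = false;
}

/*
 * Run `n` transfers. A correct driver unmaps after every transfer; the
 * leaky one skips the unmap. Returns how many transfers completed before
 * pool_map() failed, or n if none failed.
 */
int run_transfers(struct bounce_pool *p, int n, bool leaky)
{
    for (int i = 0; i < n; i++) {
        int slot = pool_map(p);
        if (slot < 0)
            return i;
        if (!leaky)
            pool_unmap(p, slot);
    }
    return n;
}
```

With the leak, the pool runs dry after POOL_SLOTS transfers no matter how
large it is; a bigger pool only delays the failure, and any other device
sharing the pool then fails to map as well.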

You might want to dump the list of drivers loaded on the system with lsmod,
and then possibly look at doing a git bisect for something introduced
between 3.17 and 3.18 since that seems to be when you started seeing
this issue.
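
The idea behind a bisect between a known-good and a known-bad kernel is a
binary search. The sketch below shows only that search, under the
assumption that every revision after the first bad one is also bad; the
callback type and the threshold oracle are hypothetical stand-ins for
"build, boot and wait for the error".

```c
#include <assert.h>
#include <stddef.h>

/* Callback: returns nonzero if the build at index `i` shows the bug. */
typedef int (*is_bad_fn)(size_t i, void *ctx);

/*
 * Revisions are ordered oldest to newest. `good` is known good and `bad`
 * is known bad, with good < bad. Returns the index of the first bad
 * revision, testing about log2(bad - good) builds.
 */
size_t bisect_first_bad(size_t good, size_t bad, is_bad_fn is_bad, void *ctx)
{
    assert(good < bad);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;
        if (is_bad(mid, ctx))
            bad = mid;    /* bug already present: look earlier */
        else
            good = mid;   /* still fine: look later */
    }
    return bad;
}

/* Hypothetical oracle: revisions at or after *ctx are bad. */
int bad_from_threshold(size_t i, void *ctx)
{
    return i >= *(const size_t *)ctx;
}
```

For a bug that takes a few hours of uptime to show, each step is slow,
which is why the number of steps matters: a range of a thousand commits
needs roughly ten test boots.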

- Alex


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Suman Tripathi
On Thu, 16 Apr 2015, Dorian Gray wrote:

> I have tested the following kernel versions:
> - 3.18.4, 3.18.6, 3.18.7, 3.19.4 [all affected]
> - 3.17.1 [unaffected]
> - 3.17.8 [probably the last unaffected version; I'm using it currently]
>
> Also, I've been using the very same configuration (hardware) along
> with 2.6.x, 3.2.x, 3.4.x, 3.10.x and have never encountered such a
> behavior before.
>
> And the problem is:
>
> When at least one external drive is plugged-in AND mounted, after ~2-4
> hours the following occurs (@11315.681561):
>
> [ 5570.110523] usb 2-1.2: new high-speed USB device number 5 using ehci-pci
> [ 5570.852917] usb 2-1.2: New USB device found, idVendor=1058, idProduct=0730
> [ 5570.852923] usb 2-1.2: New USB device strings: Mfr=1, Product=2,
> SerialNumber=3
> [ 5570.852927] usb 2-1.2: Product: My Passport 0730
> [ 5570.852930] usb 2-1.2: Manufacturer: Western Digital
> [ 5570.852933] usb 2-1.2: SerialNumber:
> [ 5570.853517] usb-storage 2-1.2:1.0: USB Mass Storage device detected
> [ 5570.853691] scsi host8: usb-storage 2-1.2:1.0
> [ 5572.932659] scsi 8:0:0:0: Direct-Access WD   My Passport
> 0730 1012 PQ: 0 ANSI: 6
> [ 5572.933013] sd 8:0:0:0: Attached scsi generic sg5 type 0
> [ 5575.306801] scsi 8:0:0:1: Enclosure WD   SES Device
>   1012 PQ: 0 ANSI: 6
> [ 5575.307160] sd 8:0:0:0: [sdc] 976707584 512-byte logical blocks:
> (500 GB/465 GiB)
> [ 5575.308405] sd 8:0:0:0: [sdc] Write Protect is off
> [ 5575.308416] sd 8:0:0:0: [sdc] Mode Sense: 47 00 10 08
> [ 5575.309772] sd 8:0:0:0: [sdc] No Caching mode page found
> [ 5575.309776] sd 8:0:0:0: [sdc] Assuming drive cache: write through
> [ 5575.311176] scsi 8:0:0:1: Attached scsi generic sg6 type 13
> [ 5575.328540]  sdc: sdc1
> [ 5575.331026] sd 8:0:0:0: [sdc] Attached SCSI disk
> [11315.681561] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 bytes)
> [11315.681565] DMA: Out of SW-IOMMU space for 32768 bytes at device 
> :00:1d.0
> [11315.681874] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 bytes)
> [11315.681876] DMA: Out of SW-IOMMU space for 32768 bytes at device 
> :00:1d.0
> [11315.682171] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 bytes)
> [11315.682174] DMA: Out of SW-IOMMU space for 32768 bytes at device 
> :00:1d.0
> [...and so on...]

> This appears to be a problem with the IOMMU or SWIOTLB subsystems, not
> the USB subsystem.  I have CC'ed the appropriate mailing lists.

Try increasing the SWIOTLB size to 128 MB. The default is 64 MB.

> Alan Stern

On Thu, Apr 16, 2015 at 7:45 PM, Alan Stern  wrote:
> On Thu, 16 Apr 2015, Dorian Gray wrote:
>
>> I have tested the following kernel versions:
>> - 3.18.4, 3.18.6, 3.18.7, 3.19.4 [all affected]
>> - 3.17.1 [unaffected]
>> - 3.17.8 [probably the last unaffected version; I'm using it currently]
>>
>> Also, I've been using the very same configuration (hardware) along
>> with 2.6.x, 3.2.x, 3.4.x, 3.10.x and have never encountered such a
>> behavior before.
>>
>> And the problem is:
>>
>> When at least one external drive is plugged-in AND mounted, after ~2-4
>> hours the following occurs (@11315.681561):
>>
>> [ 5570.110523] usb 2-1.2: new high-speed USB device number 5 using ehci-pci
>> [ 5570.852917] usb 2-1.2: New USB device found, idVendor=1058, idProduct=0730
>> [ 5570.852923] usb 2-1.2: New USB device strings: Mfr=1, Product=2,
>> SerialNumber=3
>> [ 5570.852927] usb 2-1.2: Product: My Passport 0730
>> [ 5570.852930] usb 2-1.2: Manufacturer: Western Digital
>> [ 5570.852933] usb 2-1.2: SerialNumber:
>> [ 5570.853517] usb-storage 2-1.2:1.0: USB Mass Storage device detected
>> [ 5570.853691] scsi host8: usb-storage 2-1.2:1.0
>> [ 5572.932659] scsi 8:0:0:0: Direct-Access WD   My Passport
>> 0730 1012 PQ: 0 ANSI: 6
>> [ 5572.933013] sd 8:0:0:0: Attached scsi generic sg5 type 0
>> [ 5575.306801] scsi 8:0:0:1: Enclosure WD   SES Device
>>   1012 PQ: 0 ANSI: 6
>> [ 5575.307160] sd 8:0:0:0: [sdc] 976707584 512-byte logical blocks:
>> (500 GB/465 GiB)
>> [ 5575.308405] sd 8:0:0:0: [sdc] Write Protect is off
>> [ 5575.308416] sd 8:0:0:0: [sdc] Mode Sense: 47 00 10 08
>> [ 5575.309772] sd 8:0:0:0: [sdc] No Caching mode page found
>> [ 5575.309776] sd 8:0:0:0: [sdc] Assuming drive cache: write through
>> [ 5575.311176] scsi 8:0:0:1: Attached scsi generic sg6 type 13
>> [ 5575.328540]  sdc: sdc1
>> [ 5575.331026] sd 8:0:0:0: [sdc] Attached SCSI disk
>> [11315.681561] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 
>> bytes)
>> [11315.681565] DMA: Out of SW-IOMMU space for 32768 bytes at device 
>> :00:1d.0
>> [11315.681874] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 
>> bytes)
>> [11315.681876] DMA: Out of SW-IOMMU space for 32768 bytes at device 
>> :00:1d.0
>> [11315.682171] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 
>> bytes)
>> [11315.682174] DMA: Out of SW-IOMMU space for 32768 bytes at device 
>> :00:1d.0
>> [...and so on...]
>
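> This appears to be a problem with the IOMMU or SWIOTLB subsystems, not
> the USB subsystem.  I have CC'ed the appropriate mailing lists.
>
> Alan Stern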

Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Alan Stern
On Thu, 16 Apr 2015, Dorian Gray wrote:

> I have tested the following kernel versions:
> - 3.18.4, 3.18.6, 3.18.7, 3.19.4 [all affected]
> - 3.17.1 [unaffected]
> - 3.17.8 [probably the last unaffected version; I'm using it currently]
> 
> Also, I've been using the very same configuration (hardware) along
> with 2.6.x, 3.2.x, 3.4.x, 3.10.x and have never encountered such a
> behavior before.
> 
> And the problem is:
> 
> When at least one external drive is plugged-in AND mounted, after ~2-4
> hours the following occurs (@11315.681561):
> 
> [ 5570.110523] usb 2-1.2: new high-speed USB device number 5 using ehci-pci
> [ 5570.852917] usb 2-1.2: New USB device found, idVendor=1058, idProduct=0730
> [ 5570.852923] usb 2-1.2: New USB device strings: Mfr=1, Product=2,
> SerialNumber=3
> [ 5570.852927] usb 2-1.2: Product: My Passport 0730
> [ 5570.852930] usb 2-1.2: Manufacturer: Western Digital
> [ 5570.852933] usb 2-1.2: SerialNumber:
> [ 5570.853517] usb-storage 2-1.2:1.0: USB Mass Storage device detected
> [ 5570.853691] scsi host8: usb-storage 2-1.2:1.0
> [ 5572.932659] scsi 8:0:0:0: Direct-Access WD   My Passport
> 0730 1012 PQ: 0 ANSI: 6
> [ 5572.933013] sd 8:0:0:0: Attached scsi generic sg5 type 0
> [ 5575.306801] scsi 8:0:0:1: Enclosure WD   SES Device
>   1012 PQ: 0 ANSI: 6
> [ 5575.307160] sd 8:0:0:0: [sdc] 976707584 512-byte logical blocks:
> (500 GB/465 GiB)
> [ 5575.308405] sd 8:0:0:0: [sdc] Write Protect is off
> [ 5575.308416] sd 8:0:0:0: [sdc] Mode Sense: 47 00 10 08
> [ 5575.309772] sd 8:0:0:0: [sdc] No Caching mode page found
> [ 5575.309776] sd 8:0:0:0: [sdc] Assuming drive cache: write through
> [ 5575.311176] scsi 8:0:0:1: Attached scsi generic sg6 type 13
> [ 5575.328540]  sdc: sdc1
> [ 5575.331026] sd 8:0:0:0: [sdc] Attached SCSI disk
> [11315.681561] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 bytes)
> [11315.681565] DMA: Out of SW-IOMMU space for 32768 bytes at device 
> :00:1d.0
> [11315.681874] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 bytes)
> [11315.681876] DMA: Out of SW-IOMMU space for 32768 bytes at device 
> :00:1d.0
> [11315.682171] ehci-pci :00:1d.0: swiotlb buffer is full (sz: 32768 bytes)
> [11315.682174] DMA: Out of SW-IOMMU space for 32768 bytes at device 
> :00:1d.0
> [...and so on...]

This appears to be a problem with the IOMMU or SWIOTLB subsystems, not
the USB subsystem.  I have CC'ed the appropriate mailing lists.

Alan Stern



Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Dorian Gray
On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk konrad.w...@oracle.com wrote:
 An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
 and then load the attached module.

 That should tell you who and what else is holding on to the buffers.

Thanks, this will be my next step then, right after I'm done with
testing the increased SWIOTLB.

Jake


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-16 Thread Konrad Rzeszutek Wilk
On Thu, Apr 16, 2015 at 06:57:46PM +0200, Dorian Gray wrote:
 On 16 April 2015 at 16:15, Alan Stern st...@rowland.harvard.edu wrote:
  This appears to be a problem with the IOMMU or SWIOTLB subsystems, not
  the USB subsystem.  I have CC'ed the appropriate mailing lists.
 
 Thanks, I'm far from being a kernel expert, so I expected it might
 be the wrong subsection.
 
 
 
 On 16 April 2015 at 16:24, Suman Tripathi stripa...@apm.com wrote:
  Try increasing the SWIOTLB size to 128MB. The default is 64MB.
 
 Ok, so I'm back to k3.18.7 (default in the latest Fatdog), although
 I'm not sure what the exact value of the swiotlb boot param should be.
 Got totally mixed results from uncle Google: some say the unit is
 MiB, some that it's 4k pages, and others that 128MiB = 65536, so I
 played it safe and used swiotlb=131072.
 Is this correct?
 It may take a few days, but I'll let you know if it worked (or for how
 long, if not).
 
 
 
 On 16 April 2015 at 16:54, Alexander Duyck alexander.du...@gmail.com wrote:
  More likely would be a device driver that is DMA mapping memory but not
  unmapping it after it is done resulting in the bounce buffer pool being
  depleted.
  You might want to dump the list of drivers loaded on the system with lsmod,
  and then possibly look at doing a git bisect for something introduced
  between 3.17 and 3.18 since that seems to be when you started seeing
  this issue.
 
 Ok, I'll (try to) look at this, but like I said - I'm not a kernel
 (nor git) expert.
 Anyway, I guess I'm gonna start with this:
 https://wiki.gentoo.org/wiki/Kernel_git-bisect
 Who knows...perhaps I'll find something...

An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
and then load the attached module.

That should tell you who and what else is holding on to the buffers.


 
 
 
 Thank you all for the replies.
 Jake
/*
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License v2.0 as published by
 * the Free Software Foundation
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 */

#include <linux/module.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/init.h>
#include <linux/stat.h>
#include <linux/err.h>
#include <linux/ctype.h>
#include <linux/slab.h>
#include <linux/limits.h>
#include <linux/device.h>
#include <linux/pci.h>
#include <linux/blkdev.h>

#include <linux/mm.h>
#include <linux/fcntl.h>
#include <linux/kmod.h>
#include <linux/major.h>
#include <linux/highmem.h>
#include <linux/blkpg.h>
#include <linux/buffer_head.h>
#include <linux/mpage.h>
#include <linux/mount.h>
#include <linux/uio.h>
#include <linux/namei.h>
#include <asm/uaccess.h>

#include <linux/pagemap.h>
#include <linux/pagevec.h>

#include <linux/dma-debug.h>

/* MODULE_VERSION() expects a string. */
#define DUMP_DMA_FUN  "0.1"

MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad@virtualiron>");
MODULE_DESCRIPTION("dump dma");
MODULE_LICENSE("GPL");
MODULE_VERSION(DUMP_DMA_FUN);

static int __init dump_dma_init(void)
{
	/*
	 * Requires a kernel built with CONFIG_DMA_API_DEBUG.
	 * Passing NULL dumps the active mappings of all devices
	 * to the kernel log.
	 */
	debug_dma_dump_mappings(NULL);
	return 0;
}

static void __exit dump_dma_exit(void)
{
	/* Nothing to clean up; the dump happens once at load time. */
}

module_init(dump_dma_init);
module_exit(dump_dma_exit);