Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-15 Thread Peter Stuge
Larry Finger wrote:
> merely triggered by some interaction with ACPI and/or the BIOS.
> From what I found in looking back through the DMA error reports,
> most (if not all) people with the problem have netbook computers
> with Intel ATOM processors.

Gábor Stefanik wrote:
> Linus has also reported this issue on a Core 2 ULV. I suspect that
> the key part is deep-sleep support in the CPU.

Atom has new (lower) power states, with wakeup sequences that are
very different from previous PC platforms. I don't know the exact
details, unfortunately. :\


> Also, PhoenixBIOS seems to play a part in the problem.

I do know that the power sequencing is not really part of the Intel
reference platform, but rather it will be solved in a microcontroller
or embedded controller by the systems designer. The BIOS needs to
support this properly. This is all new stuff on PCs. It is very
possible that there are BIOS bugs.

If you send an email to the coreboot mailing list, it is possible
that someone there has an Atom system running coreboot and would be
willing to help test if given a card. If so, they are also very
knowledgeable about the low-level details.


//Peter
___
Bcm43xx-dev mailing list
Bcm43xx-dev@lists.berlios.de
https://lists.berlios.de/mailman/listinfo/bcm43xx-dev


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-14 Thread Larry Finger
On 11/14/2009 12:51 PM, William Bourque wrote:
> 
> Ok, I have already tried values of 150 and 100, and I'm recompiling to see
> the result with 175 and 125.
>
> 150us seems to give me the best result. As with 200, I could use the
> wireless for several minutes and insert/remove the module. However, it
> failed when I tried to transfer a big file at full speed over the LAN. So
> for some reason, low speed seems to work ok (although I had some PHY
> transmission errors, but I suppose they are not related).
>
> 100us is worse than the unpatched code. The wireless fails as soon as I
> bring up the interface, and the DMA errors then repeat at a very high
> rate. Then, when I try to remove the module, "modprobe" has a very
> bad time. On the first try, it took around 3 minutes to be able to remove
> the module, and on the second try, the machine just hung (I couldn't see
> if there was an oops or something).
> In the meantime, I could tell that the DMA errors were still piling
> up, as the wireless LED was furiously flashing from red to blue (usual
> behavior on an error).
>
> I'll try 175usec first, then 125usec to see if anything better happens,
> but I doubt it... I think the patch just fixed a part of the problem,
> not the whole.

I'm beginning to believe that this patch fixes nothing. If it were valid, it
would work as soon as you got below some threshold and you wouldn't find 100
being worse than 150. I did learn (or relearn) that it makes a difference if wl
has been installed previously without an intervening power off.

I have started looking from a different angle. I have the MMIO trace for wl
after a cold boot, and a similar one for b43. There are many differences - the
current exercise is to find out what they are doing.

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-14 Thread William Bourque
Larry Finger wrote:
> On 11/13/2009 06:15 PM, William Bourque wrote:
>> Larry Finger wrote:
>>> Based on a suggestion by Matthew Garrett, please try the patch below.
>>>
>>> Thanks,
>>>
>>> Larry
>>>
>>> =
>>>
>>>
>>> Index: wireless-testing/drivers/net/wireless/b43/main.c
>>> ===
>>> --- wireless-testing.orig/drivers/net/wireless/b43/main.c
>>> +++ wireless-testing/drivers/net/wireless/b43/main.c
>>> @@ -43,6 +43,7 @@
>>>  #include 
>>>  #include 
>>>  #include 
>>> +#include 
>>>
>>>  #include "b43.h"
>>>  #include "main.h"
>>> @@ -3881,6 +3882,8 @@ redo:
>>> if (!dev || b43_status(dev) < B43_STAT_STARTED)
>>> return dev;
>>>
>>> +   pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43",
>>> + PM_QOS_DEFAULT_VALUE);
>>> /* Cancel work. Unlock to avoid deadlocks. */
>>> mutex_unlock(&wl->mutex);
>>> cancel_delayed_work_sync(&dev->periodic_work);
>>> @@ -3963,6 +3966,9 @@ static int b43_wireless_core_start(struc
>>> /* We are ready to run. */
>>> b43_set_status(dev, B43_STAT_STARTED);
>>>
>>> +   /* Set the maximum DMA latency */
>>> +   pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43", 200);
>>> +
>>> /* Start data flow (TX/RX). */
>>> b43_mac_enable(dev);
>>> b43_write32(dev, B43_MMIO_GEN_IRQ_MASK, dev->irq_mask);
>>>
>>>
>>>
>>>
>> Well, this one did not solve the problem but it certainly did help!
>>
>> It took me some effort to make it bug again... It worked fine for 5
>> minutes, I was able to browse the web and such. I also
>> removed/inserted the module a few times. It finally crashed when I tried
>> to transfer a file of several MB.
>>
>> Maybe the 200ms delay should be less (or more?) I think I will try to 
>> change it for some arbitrary number, just to see if it helps.
> 
> You should try decreasing it. That parameter is used by 2 drivers in the kernel:
> ipw2100 with a value of 175 and e1000e with a value of 55. I would expect the
> right value to be closer to that of the other wireless device than to that of a
> wired interface. Please try 150. If that also fails, try 100. BTW, the parameter
> is in usec, not msec.
> 
> I'm finally encouraged that we might figure out this problem.
> 
> Larry
> 

Ok, I have already tried values of 150 and 100, and I'm recompiling to see
the result with 175 and 125.

150us seems to give me the best result. As with 200, I could use the
wireless for several minutes and insert/remove the module. However, it
failed when I tried to transfer a big file at full speed over the LAN. So
for some reason, low speed seems to work ok (although I had some PHY
transmission errors, but I suppose they are not related).

100us is worse than the unpatched code. The wireless fails as soon as I
bring up the interface, and the DMA errors then repeat at a very high
rate. Then, when I try to remove the module, "modprobe" has a very
bad time. On the first try, it took around 3 minutes to be able to remove
the module, and on the second try, the machine just hung (I couldn't see
if there was an oops or something).
In the meantime, I could tell that the DMA errors were still piling
up, as the wireless LED was furiously flashing from red to blue (usual
behavior on an error).

I'll try 175usec first, then 125usec to see if anything better happens,
but I doubt it... I think the patch just fixed a part of the problem,
not the whole.

Oh and I didn't bother posting the output of dmesg as it is the exact 
same thing again but if you need it I kept it.

- William




Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-14 Thread Andrew Benton
On 14/11/09 11:24, Chris Vine wrote:
> Be aware that if you have been using the proprietary wl driver to send
> on your bug reports, you must do a cold boot before testing b43, as if
> you warm boot after having initialised the wireless device with the wl
> driver then the DMA bug disappears.

That could be it. Today I can't connect with a kernel compiled with
CONFIG_ACPI_PROCESSOR=y
I think that last night I did a warm reboot from a kernel compiled with
# CONFIG_ACPI is not set
so the firmware was initialised in the device. What I don't understand 
is why I can't get the same thing to work today. I've tried recompiling 
with latencies of 150 and 100 and they didn't work either.
The only way I can get it to connect today is with
# CONFIG_ACPI_PROCESSOR is not set
Which is how it was before Larry's new patch.

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-14 Thread Chris Vine
On Sat, 14 Nov 2009 09:52:15 +
Andrew Benton wrote:
> It was working fine last night but I can't get that kernel to connect 
> today. Nothing has changed, I can see no reason why it was working
> and isn't working now. It feels like a hardware problem.
> I'm recompiling with a lower latency number (150)

Be aware that if you have been using the proprietary wl driver to send
on your bug reports, you must do a cold boot before testing b43, as if
you warm boot after having initialised the wireless device with the wl
driver then the DMA bug disappears.

Chris


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-14 Thread Andrew Benton
On 14/11/09 07:29, Matthew Garrett wrote:
>
> I should emphasise that this patch works by effectively disabling deep C
> states on your CPU, which in turn will increase your power consumption.
> It's very much either a workaround for broken hardware or something that
> covers up a more subtle bug somewhere else. If it turns out that it is
> required, efforts should be made to limit it to the code regions that
> absolutely require this behaviour.
>

It was working fine last night but I can't get that kernel to connect 
today. Nothing has changed, I can see no reason why it was working and 
isn't working now. It feels like a hardware problem.
I'm recompiling with a lower latency number (150)

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-14 Thread Rafał Miłecki
2009/11/14 Matthew Garrett:
> On Sat, Nov 14, 2009 at 12:41:49AM +, Andrew Benton wrote:
>
>> And it seems to be working well. No errors so far. I've just downloaded
>> a kernel, browsed slashdot a bit. I'll test it some more tomorrow but
>> this is a BIG step in the right direction. This is the first kernel
>> that's worked for me with CONFIG_ACPI_PROCESSOR=y
>
> I should emphasise that this patch works by effectively disabling deep C
> states on your CPU, which in turn will increase your power consumption.
> It's very much either a workaround for broken hardware or something that
> covers up a more subtle bug somewhere else. If it turns out that it is
> required, efforts should be made to limit it to the code regions that
> absolutely require this behaviour.

If this seems to be ATOM related, could we maybe ask Intel for
help or an opinion? I don't know whom to ask, but Intel is quite friendly and
responsible. On the other hand, they used the same hack in their ipw2x00
driver... so maybe there is not much sense in that.

-- 
Rafał


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Matthew Garrett
On Sat, Nov 14, 2009 at 12:41:49AM +, Andrew Benton wrote:

> And it seems to be working well. No errors so far. I've just downloaded 
> a kernel, browsed slashdot a bit. I'll test it some more tomorrow but 
> this is a BIG step in the right direction. This is the first kernel 
> that's worked for me with CONFIG_ACPI_PROCESSOR=y

I should emphasise that this patch works by effectively disabling deep C 
states on your CPU, which in turn will increase your power consumption. 
It's very much either a workaround for broken hardware or something that 
covers up a more subtle bug somewhere else. If it turns out that it is 
required, efforts should be made to limit it to the code regions that 
absolutely require this behaviour.
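
For anyone who wants to experiment with the same constraint without
rebuilding the driver, here is a minimal userspace sketch in C. It assumes
the kernel exposes the PM QoS device node /dev/cpu_dma_latency, which takes
a 32-bit latency bound in microseconds and keeps the request active only
while the file descriptor stays open. The helper name
hold_cpu_dma_latency() and its parameters are made up for this
illustration and are not part of Larry's patch.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the b43 patch): write a 32-bit latency
 * bound, in microseconds, to a PM QoS device node and return the open
 * file descriptor. The request is assumed to stay active only while the
 * descriptor remains open. Returns -1 on failure.
 */
int hold_cpu_dma_latency(const char *node, int32_t usec)
{
	int fd = open(node, O_WRONLY);

	if (fd < 0)
		return -1;

	/* The node expects the value as a raw binary 32-bit integer. */
	if (write(fd, &usec, sizeof(usec)) != (ssize_t)sizeof(usec)) {
		close(fd);
		return -1;
	}

	return fd;
}
```

A test program would call hold_cpu_dma_latency("/dev/cpu_dma_latency", 150)
(this usually needs root), run the wireless test while the descriptor is
open, and close() it afterwards to drop the request. Like the patch, this
keeps the CPU out of deep C states and costs power, so it is only a
debugging aid.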

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Larry Finger
On 11/13/2009 06:15 PM, William Bourque wrote:
> 
> Larry Finger wrote:
>> Based on a suggestion by Matthew Garrett, please try the patch below.
>>
>> Thanks,
>>
>> Larry
>>
>> =
>>
>>
>> Index: wireless-testing/drivers/net/wireless/b43/main.c
>> ===
>> --- wireless-testing.orig/drivers/net/wireless/b43/main.c
>> +++ wireless-testing/drivers/net/wireless/b43/main.c
>> @@ -43,6 +43,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>>
>>  #include "b43.h"
>>  #include "main.h"
>> @@ -3881,6 +3882,8 @@ redo:
>>  if (!dev || b43_status(dev) < B43_STAT_STARTED)
>>  return dev;
>>
>> +pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43",
>> +  PM_QOS_DEFAULT_VALUE);
>>  /* Cancel work. Unlock to avoid deadlocks. */
>>  mutex_unlock(&wl->mutex);
>>  cancel_delayed_work_sync(&dev->periodic_work);
>> @@ -3963,6 +3966,9 @@ static int b43_wireless_core_start(struc
>>  /* We are ready to run. */
>>  b43_set_status(dev, B43_STAT_STARTED);
>>
>> +/* Set the maximum DMA latency */
>> +pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43", 200);
>> +
>>  /* Start data flow (TX/RX). */
>>  b43_mac_enable(dev);
>>  b43_write32(dev, B43_MMIO_GEN_IRQ_MASK, dev->irq_mask);
>>
>>
>>
>>
> 
> Well, this one did not solve the problem but it certainly did help!
> 
> It took me some effort to make it bug again... It worked fine for 5
> minutes, I was able to browse the web and such. I also
> removed/inserted the module a few times. It finally crashed when I tried
> to transfer a file of several MB.
> 
> Maybe the 200ms delay should be less (or more?) I think I will try to 
> change it for some arbitrary number, just to see if it helps.

You should try decreasing it. That parameter is used by 2 drivers in the kernel:
ipw2100 with a value of 175 and e1000e with a value of 55. I would expect the
right value to be closer to that of the other wireless device than to that of a
wired interface. Please try 150. If that also fails, try 100. BTW, the parameter
is in usec, not msec.

I'm finally encouraged that we might figure out this problem.

Larry



Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Andrew Benton
On 13/11/09 21:38, Larry Finger wrote:
> Based on a suggestion by Matthew Garrett, please try the patch below.
>

I've only been using it for a few minutes but this looks very good. I
compiled the kernel with lots of ACPI options:
CONFIG_ACPI=y
CONFIG_ACPI_SYSFS_POWER=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_CUSTOM_DSDT_FILE=""
CONFIG_ACPI_BLACKLIST_YEAR=0
CONFIG_PNPACPI=y
CONFIG_ATA_ACPI=y

And it seems to be working well. No errors so far. I've just downloaded 
a kernel, browsed slashdot a bit. I'll test it some more tomorrow but 
this is a BIG step in the right direction. This is the first kernel 
that's worked for me with CONFIG_ACPI_PROCESSOR=y
Thanks

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread William Bourque

Larry Finger wrote:
> Based on a suggestion by Matthew Garrett, please try the patch below.
> 
> Thanks,
> 
> Larry
> 
> =
> 
> 
> Index: wireless-testing/drivers/net/wireless/b43/main.c
> ===
> --- wireless-testing.orig/drivers/net/wireless/b43/main.c
> +++ wireless-testing/drivers/net/wireless/b43/main.c
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  #include "b43.h"
>  #include "main.h"
> @@ -3881,6 +3882,8 @@ redo:
>   if (!dev || b43_status(dev) < B43_STAT_STARTED)
>   return dev;
> 
> + pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43",
> +   PM_QOS_DEFAULT_VALUE);
>   /* Cancel work. Unlock to avoid deadlocks. */
>   mutex_unlock(&wl->mutex);
>   cancel_delayed_work_sync(&dev->periodic_work);
> @@ -3963,6 +3966,9 @@ static int b43_wireless_core_start(struc
>   /* We are ready to run. */
>   b43_set_status(dev, B43_STAT_STARTED);
> 
> + /* Set the maximum DMA latency */
> + pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43", 200);
> +
>   /* Start data flow (TX/RX). */
>   b43_mac_enable(dev);
>   b43_write32(dev, B43_MMIO_GEN_IRQ_MASK, dev->irq_mask);
> 
> 
> 
> 

Well, this one did not solve the problem but it certainly did help!

It took me some effort to make it bug again... It worked fine for 5
minutes, I was able to browse the web and such. I also
removed/inserted the module a few times. It finally crashed when I tried
to transfer a file of several MB.

Maybe the 200ms delay should be less (or more?) I think I will try to 
change it for some arbitrary number, just to see if it helps.

- William


[  393.039477] b43-phy1: Broadcom 4312 WLAN found (core revision 15)
[  393.130317] b43-phy1 debug: Found PHY: Analog 6, Type 5, Revision 1
[  393.130385] b43-phy1 debug: Found Radio: Manuf 0x17F, Version 0x2062, Revision 2
[  393.192506] phy1: Selected rate control algorithm 'minstrel'
[  393.193173] Registered led device: b43-phy1::tx
[  393.193241] Registered led device: b43-phy1::rx
[  393.193309] Registered led device: b43-phy1::radio
[  393.193674] Broadcom 43xx driver loaded [ Features: PLS, Firmware-ID: FW13 ]
[  421.362952] b43 ssb0:0: firmware: requesting b43/ucode15.fw
[  421.381454] b43 ssb0:0: firmware: requesting b43/lp0initvals15.fw
[  421.392366] b43 ssb0:0: firmware: requesting b43/lp0bsinitvals15.fw
[  421.540362] b43-phy1: Loading firmware version 410.2160 (2007-05-26 15:32:10)
[  421.542820] b43-phy1 debug: b2062: Using crystal tab entry 19200 kHz.
[  422.222892] b43-phy1 debug: Chip initialized
[  422.223311] b43-phy1 debug: 64-bit DMA initialized
[  422.223442] b43-phy1 debug: QoS enabled
[  422.261116] b43-phy1 debug: Wireless interface started
[  422.280460] b43-phy1 debug: Adding Interface type 2
...skipped
[  440.655981] wlan0: associate with AP ca:fe:ca:fe:ca:fe (try 1)
[  440.667704] wlan0: RX ReassocResp from ca:fe:ca:fe:ca:fe (capab=0x411 status=0 aid=193)
[  440.667719] wlan0: invalid aid value 193; bits 15:14 not set
[  440.667727] wlan0: associated
[  450.040199] wlan0: no IPv6 routers present
[  470.103377] b43-phy1 ERROR: PHY transmission error
[  505.569818] b43-phy1 ERROR: PHY transmission error
[  505.666591] b43-phy1 ERROR: PHY transmission error
[  506.238169] b43-phy1 ERROR: PHY transmission error
[  506.717960] b43-phy1 ERROR: PHY transmission error
[  506.821628] b43-phy1 ERROR: PHY transmission error
[  506.967619] b43-phy1 ERROR: PHY transmission error
[  545.083221] b43-phy1 ERROR: PHY transmission error
[  563.656362] b43-phy1 ERROR: PHY transmission error
[  563.702653] b43-phy1 ERROR: PHY transmission error
[  563.914893] b43-phy1 ERROR: PHY transmission error
[  563.950189] b43-phy1 ERROR: PHY transmission error
[  564.077438] b43-phy1 ERROR: PHY transmission error
[  564.109533] b43-phy1 ERROR: PHY transmission error
[  564.193257] b43-phy1 ERROR: PHY transmission error
[  564.369051] b43-phy1 ERROR: PHY transmission error
[  564.398540] b43-phy1 ERROR: PHY transmission error
[  565.168589] b43: Dump of last 20 DMA descriptors
[  565.168607] b43: Descr.  0: 0x6000 0x68 0x3619FC74 0x8000
[  565.168619] b43: Descr.  1: 0x8000 0x6E 0x36BFDEF0 0x8000
[  565.168629] b43: Descr.  2: 0x0 0x930 0x36B66020 0x8000
[  565.168639] b43: Descr.  3: 0x0 0x930 0x25989020 0x8000
[  565.168649] b43: Descr.  4: 0x6000 0x68 0x25ACEC74 0x8000
[  565.168659] b43: Descr.  5: 0x8000 0x6E 0x36BFDE82 0x8000
[  565.168669] b43: Descr.  6: 0x0 0x930 0x25806020 0x8000
[  565.168678] b43: Descr.  7: 0x0 0x930 0x2587B020 0x8000
[  565.168688] b43: Descr.  8: 0x6000 0x68 0x36B50C74 0x8000
[  565.168698] b43: Descr.  9: 0x8000 0x6E 0x36BFDE14 0x8000
[  565.168708] b43: Descr. 10: 0x0 0x930 0x259F3020 0x8000
[  565.168718] b43: Descr. 11: 0x0 0x930 0x35725020 0x8000
[  565.168728] b43: Descr. 12: 

Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Larry Finger
Based on a suggestion by Matthew Garrett, please try the patch below.

Thanks,

Larry

=


Index: wireless-testing/drivers/net/wireless/b43/main.c
===
--- wireless-testing.orig/drivers/net/wireless/b43/main.c
+++ wireless-testing/drivers/net/wireless/b43/main.c
@@ -43,6 +43,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "b43.h"
 #include "main.h"
@@ -3881,6 +3882,8 @@ redo:
 	if (!dev || b43_status(dev) < B43_STAT_STARTED)
 		return dev;

+	pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43",
+				  PM_QOS_DEFAULT_VALUE);
 	/* Cancel work. Unlock to avoid deadlocks. */
 	mutex_unlock(&wl->mutex);
 	cancel_delayed_work_sync(&dev->periodic_work);
@@ -3963,6 +3966,9 @@ static int b43_wireless_core_start(struc
 	/* We are ready to run. */
 	b43_set_status(dev, B43_STAT_STARTED);

+	/* Set the maximum DMA latency */
+	pm_qos_update_requirement(PM_QOS_CPU_DMA_LATENCY, "b43", 200);
+
 	/* Start data flow (TX/RX). */
 	b43_mac_enable(dev);
 	b43_write32(dev, B43_MMIO_GEN_IRQ_MASK, dev->irq_mask);






Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Larry Finger
On 11/13/2009 11:36 AM, Michael Buesch wrote:
> Please test the following patch. It changes more stuff related to the
> descriptor ring handling (remove the old patch first before applying this one).
> http://bu3sch.de/patches/wireless-testing/20091113-1834/patches/001-b43-rewrite-dma-ring-alloc.patch

This one works fine on my 4311. I'll change to the 4315 device later.

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread William Bourque

Michael Buesch wrote:
> Please test the following patch. It changes more stuff related to the
> descriptor ring handling (remove the old patch first before applying this one).
> http://bu3sch.de/patches/wireless-testing/20091113-1834/patches/001-b43-rewrite-dma-ring-alloc.patch
> 

Hi

Here is the result. I provide two outputs because I tried a few times and
it gave me somewhat different results, especially in the DMA descriptors.

I hope it helps.

- William


#1 :
[7.781844] b43-phy0: Broadcom 4312 WLAN found (core revision 15)
[7.880289] b43-phy0 debug: Found PHY: Analog 6, Type 5, Revision 1
[7.880314] b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2062, Revision 2
[7.958618] phy0: Selected rate control algorithm 'minstrel'
[7.958980] Registered led device: b43-phy0::tx
[7.959029] Registered led device: b43-phy0::rx
[7.959085] Registered led device: b43-phy0::radio
[7.959430] Broadcom 43xx driver loaded [ Features: PLS, Firmware-ID: FW13 ]
...skipped
[   99.380312] b43 ssb0:0: firmware: requesting b43/ucode15.fw
[   99.395617] b43 ssb0:0: firmware: requesting b43/lp0initvals15.fw
[   99.407106] b43 ssb0:0: firmware: requesting b43/lp0bsinitvals15.fw
[   99.552915] b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
[   99.555371] b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
[  113.312878] b43-phy0 debug: Chip initialized
[  113.313311] b43-phy0 debug: 64-bit DMA initialized
[  113.313440] b43-phy0 debug: QoS enabled
[  113.353549] b43-phy0 debug: Wireless interface started
[  113.370279] b43-phy0 debug: Adding Interface type 2
[  113.393835] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[  120.430217] b43-phy0 ERROR: Fatal DMA error: 0x0800, 0x, 0x, 0x, 0x, 0x
[  120.430245] b43-phy0: Controller RESET (DMA error) ...
[  120.430276] b43: Dump of last 20 DMA descriptors
[  120.430290] b43: Descr.  0: 0x0 0x930 0x2587F020 0x8000
[  120.430300] b43: Descr.  1: 0x0 0x930 0x356F3020 0x8000
[  120.430310] b43: Descr.  2: 0x0 0x930 0x258BB020 0x8000
[  120.430319] b43: Descr.  3: 0x0 0x930 0x368C1020 0x8000
[  120.430329] b43: Descr.  4: 0x0 0x930 0x35744020 0x8000
[  120.430338] b43: Descr.  5: 0x0 0x930 0x258B9020 0x8000
[  120.430348] b43: Descr.  6: 0x0 0x930 0x35CAA020 0x8000
[  120.430358] b43: Descr.  7: 0x0 0x930 0x35771020 0x8000
[  120.430367] b43: Descr.  8: 0x0 0x930 0x259B8020 0x8000
[  120.430377] b43: Descr.  9: 0x0 0x930 0x259A5020 0x8000
[  120.430387] b43: Descr. 10: 0x1000 0x930 0x36980020 0x8000
[  120.430397] b43: Descr. 11: 0x0 0x930 0x25967020 0x8000
[  120.430407] b43: Descr. 12: 0x0 0x930 0x25947020 0x8000
[  120.430416] b43: Descr. 13: 0x0 0x930 0x35489020 0x8000
[  120.430426] b43: Descr. 14: 0x0 0x930 0x258B3020 0x8000
[  120.430436] b43: Descr. 15: 0x0 0x930 0x35DB4020 0x8000
[  120.430445] b43: Descr. 16: 0x0 0x930 0x369CE020 0x8000
[  120.430455] b43: Descr. 17: 0x0 0x930 0x3788C020 0x8000
[  120.430465] b43: Descr. 18: 0x0 0x930 0x25966020 0x8000
[  120.430474] b43: Descr. 19: 0x0 0x930 0x3554E020 0x8000
[  120.430681] b43-phy0 debug: Wireless interface stopped
[  120.430698] b43-phy0 debug: DMA-64 rx_ring: Used slots 1/64, Failed frames 0/0 = 0.0%, Average tries 0.00
[  120.430804] b43-phy0 debug: DMA-64 tx_ring_AC_BK: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
[  120.442760] b43-phy0 debug: DMA-64 tx_ring_AC_BE: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
[  120.462595] b43-phy0 debug: DMA-64 tx_ring_AC_VI: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
[  120.482563] b43-phy0 debug: DMA-64 tx_ring_AC_VO: Used slots 2/256, Failed frames 0/11 = 0.0%, Average tries 1.00
[  120.504203] b43-phy0 debug: DMA-64 tx_ring_mcast: Used slots 0/256, Failed frames 0/0 = 0.0%, Average tries 0.00
[  120.762641] b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
[  120.765121] b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
[  134.520328] b43-phy0 debug: Chip initialized
[  134.520668] b43-phy0 debug: 64-bit DMA initialized
[  134.520799] b43-phy0 debug: QoS enabled
[  134.563410] b43-phy0 debug: Wireless interface started
[  134.563424] b43-phy0: Controller restarted
[  134.583034] b43-phy0 ERROR: Fatal DMA error: 0x0400, 0x, 0x, 0x, 0x, 0x



#2 :
[7.834359] b43-phy0: Broadcom 4312 WLAN found (core revision 15)
[7.930184] b43-phy0 debug: Found PHY: Analog 6, Type 5, Revision 1
[7.930211] b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2062, Revision 2
[8.016791] phy0: Selected rate control algorithm 'minstrel'
[8.017151] Registered led device: b43-phy0::tx
[8.017198] Registered led device: b43-phy0::rx
[8.017248] Registered led device: b43-phy0::radio
[8.017611] Broadcom 43xx driver loaded 

Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Michael Buesch
On Friday 13 November 2009 18:57:52 Andrew Benton wrote:
> On 13/11/09 17:36, Michael Buesch wrote:
> > Please test the following patch. It changes more stuff related to the
> > descriptor ring handling (remove the old patch first before applying this one).
> > http://bu3sch.de/patches/wireless-testing/20091113-1834/patches/001-b43-rewrite-dma-ring-alloc.patch
> >
> 
> Should I still apply Larry's patch as well?

Well, it doesn't really matter. Leave it out if it doesn't apply cleanly anymore.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Andrew Benton
On 13/11/09 17:36, Michael Buesch wrote:
> Please test the following patch. It changes more stuff related to the
> descriptor ring handling (remove the old patch first before applying this one).
> http://bu3sch.de/patches/wireless-testing/20091113-1834/patches/001-b43-rewrite-dma-ring-alloc.patch
>

Should I still apply Larry's patch as well?

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Michael Buesch
Please test the following patch. It changes more stuff related to the
descriptor ring handling (remove the old patch first before applying this one).
http://bu3sch.de/patches/wireless-testing/20091113-1834/patches/001-b43-rewrite-dma-ring-alloc.patch

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Gábor Stefanik
On Fri, Nov 13, 2009 at 5:05 PM, Larry Finger wrote:
> On 11/13/2009 05:16 AM, Michael Buesch wrote:
>> Ok, so my guess is that the DMA allocator simply returned high memory
>> that was unusable to the device. My new code explicitly checks for that (and a
>> few other things) and retries with GFP_DMA in case the address has illegal bits set.
>> That's the same thing we do for the frame buffers, so I don't see anything wrong with it.
>>
>> Yeah, the new code is big and scary, but I think it makes a whole lot more sense than
>> what we have now. Especially if I add some comments and do more cleanups.
>
> I agree that the new code makes a lot of sense. I'm also beginning to believe
> that the problem lies outside b43, and that it is merely triggered by some
> interaction with ACPI and/or the BIOS. From what I found in looking back through
> the DMA error reports, most (if not all) people with the problem have netbook
> computers with Intel ATOM processors.
>
> I am considering posting on LKML and the ACPI mailing lists to see if we can get
> any ideas from those experts. Please comment on the draft text below:
>
>
>
> A number of users are experiencing DMA descriptor or data errors using 64-bit
> DMA with the Broadcom BCM4312 wireless device. After careful review and a
> rewrite of the DMA code in the driver, we have not been able to fix the problem,
> but we have determined the following:
>
> (1) The problem is much more likely to occur on netbook systems. Several of the
> developers have this card in regular notebook systems. None of us have the
> problem, thus it may occur only on netbooks, but several brand/model
> combinations are affected including Dell Inspiron 910 and Acer Aspire One A150.

Linus has also reported this issue on a Core 2 ULV. I suspect that the
key part is deep-sleep support in the CPU.
Also, PhoenixBIOS seems to play a part in the problem.

>
> (2) If CONFIG_ACPI_PROCESSOR is not set on affected systems, the error rate is
> much lower.
>
> (3) When a DMA descriptor error occurs, a dump of the descriptors does not
> reveal any obvious problems.
>
> I do not know enough about either the ACPI or DMA code to begin debugging in
> either of those regions. Any suggestions on debugging strategies, or links to
> similar problems would be appreciated.
>
> 
>
> Larry
>
>



-- 
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Michael Buesch
On Friday 13 November 2009 17:05:30 Larry Finger wrote:
> (3) When a DMA descriptor error occurs, a dump of the descriptors does not
> reveal any obvious problems.

I was going to write a patch that dumps the whole affected ring. But I don't
think we will see anything suspicious there, either. So I guess you can ask for
advice before I do that test.
I'll do the test patch now, but in the meantime you can ask for advice.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Larry Finger
On 11/13/2009 05:16 AM, Michael Buesch wrote:
> Ok, so my guess is that the DMA allocator simply returned high memory
> that was unusable to the device. My new code explicitly checks for that (and a
> few other things) and retries with GFP_DMA in case the address has illegal bits set.
> That's the same thing we do for the frame buffers, so I don't see anything wrong with it.
>
> Yeah, the new code is big and scary, but I think it makes a whole lot more sense than
> what we have now. Especially if I add some comments and do more cleanups.

I agree that the new code makes a lot of sense. I'm also beginning to believe
that the problem lies outside b43, and that it is merely triggered by some
interaction with ACPI and/or the BIOS. From what I found in looking back through
the DMA error reports, most (if not all) people with the problem have netbook
computers with Intel ATOM processors.

I am considering posting on LKML and the ACPI mailing lists to see if we can get
any ideas from those experts. Please comment on the draft text below:



A number of users are experiencing DMA descriptor or data errors using 64-bit
DMA with the Broadcom BCM4312 wireless device. After careful review and a
rewrite of the DMA code in the driver, we have not been able to fix the problem,
but we have determined the following:

(1) The problem is much more likely to occur on netbook systems. Several of the
developers have this card in regular notebook systems. None of us have the
problem, thus it may occur only on netbooks, but several brand/model
combinations are affected including Dell Inspiron 910 and Acer Aspire One A150.

(2) If CONFIG_ACPI_PROCESSOR is not set on affected systems, the error rate is
much lower.

(3) When a DMA descriptor error occurs, a dump of the descriptors does not
reveal any obvious problems.

I do not know enough about either the ACPI or DMA code to begin debugging in
either of those regions. Any suggestions on debugging strategies, or links to
similar problems would be appreciated.



Larry



Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Andrew Benton
On 13/11/09 13:18, Andrew Benton wrote:
> Since I applied Larry's patch I've not had any errors with a kernel
> compiled with
> # CONFIG_ACPI is not set

Wouldn't you know it, as soon as I sent that I got this:-

Nov 13 13:21:22 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000000,
0x00000800, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 13:21:22 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 13:21:22 doughnut kernel: b43: Dump of last 20 DMA descriptors
Nov 13 13:21:22 doughnut kernel: b43: Descr.  0: 0x6000 0x604 0x36BB607C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  1: 0x8000 0x6E 0x3684AC42 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  2: 0x6000 0x604 0x36ADD07C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  3: 0x8000 0x6E 0x3684ABD4 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x36AE8020 0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  5: 0x6000 0x604 0x3116907C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  6: 0x8000 0x6E 0x3684AB66 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  7: 0x6000 0x604 0x36BA307C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  8: 0x8000 0x6E 0x3684AAF8 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr.  9: 0x0 0x930 0x36BA0020 0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 10: 0x6000 0x604 0x36AF507C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 11: 0x8000 0x6E 0x3684AA8A 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 12: 0x6000 0x604 0x3351807C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 13: 0x8000 0x6E 0x3684AA1C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 14: 0x0 0x930 0x3351C020 0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 15: 0x6000 0x604 0x3351F07C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 16: 0x8000 0x6E 0x3684A9AE 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 17: 0x6000 0x604 0x3465007C 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 18: 0x8000 0x6E 0x3684A940 
0x8000
Nov 13 13:21:22 doughnut kernel: b43: Descr. 19: 0x0 0x930 0x36B0C020 0x8000
Nov 13 13:21:22 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 13:21:28 doughnut kernel: b43-phy0: Controller restarted
Nov 13 13:21:28 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000400,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 13:21:28 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 13:21:28 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 13:21:33 doughnut kernel: b43-phy0: Controller restarted
Nov 13 13:21:34 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000400,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 13:21:34 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 13:21:34 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 13:21:39 doughnut kernel: b43-phy0: Controller restarted
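
For reading logs like the one above, a small helper can pick out which of
the six words after "Fatal DMA error:" is nonzero and which bit is set in it.
This is only a sketch. It assumes each word belongs to one DMA engine, which
the thread does not state, and the names NUM_REASONS, first_failing_index and
lowest_set_bit are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_REASONS 6	/* assumption: six words per error line, as in the logs */

/* Index of the first nonzero reason word, or -1 if all are zero. */
static int first_failing_index(const uint32_t reasons[NUM_REASONS])
{
	for (int i = 0; i < NUM_REASONS; i++) {
		if (reasons[i] != 0)
			return i;
	}
	return -1;
}

/* Position (0-based) of the lowest set bit; value must be nonzero. */
static int lowest_set_bit(uint32_t value)
{
	int bit = 0;

	assert(value != 0);
	while ((value & 1u) == 0) {
		value >>= 1;
		bit++;
	}
	return bit;
}
```

With these, a word of 0x00000400 maps to bit 10 and 0x00000800 to bit 11.
In the first error above the nonzero word is the second one (index 1), while
in the later errors it is the first (index 0).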

Last time I had a problem with a kernel compiled with
# CONFIG_ACPI is not set
it produced the mm/slub.c bug

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Andrew Benton
On 12/11/09 21:16, Michael Buesch wrote:
> Here you go:
> http://bu3sch.de/patches/wireless-testing/20091112-2213/patches/001-b43-rewrite-dma-ring-alloc.patch
> Please test this patch (also on 64bit-DMA devices that currently work).
> 
> It seriously lacks some comments, but I'll add them later if that works.
> 

I recompiled with this patch and enabled CONFIG_ACPI_PROCESSOR=y which 
produced this:-

Nov 13 12:14:08 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000800,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 12:14:08 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 12:14:08 doughnut kernel: b43: Dump of last 20 DMA descriptors
Nov 13 12:14:08 doughnut kernel: b43: Descr.  0: 0x0 0x930 0x35CD5020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  1: 0x0 0x930 0x35CDE020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  2: 0x6000 0x74 0x36894C28 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  3: 0x8000 0x6E 0x35CC9810 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x35CDB020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  5: 0x6000 0x30 0x36AEC420 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  6: 0x8000 0x6E 0x35740672 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  7: 0x0 0x930 0x35CD9020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  8: 0x0 0x930 0x35CDC020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr.  9: 0x0 0x930 0x35CDD020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 10: 0x0 0x930 0x35CDF020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 11: 0x0 0x930 0x35CDA020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 12: 0x6000 0x6F 0x36AEC828 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 13: 0x8000 0x6E 0x35CC97A2 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 14: 0x0 0x930 0x3712E020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 15: 0x0 0x930 0x3712F020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 16: 0x6000 0x6F 0x36896428 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 17: 0x8000 0x6E 0x35CC9734 
0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 18: 0x0 0x930 0x36106020 0x8000
Nov 13 12:14:08 doughnut kernel: b43: Descr. 19: 0x0 0x930 0x36105020 0x8000
Nov 13 12:14:09 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 12:14:14 doughnut kernel: b43-phy0: Controller restarted
Nov 13 12:14:14 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000400,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 12:14:14 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 12:14:14 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 12:14:20 doughnut kernel: b43-phy0: Controller restarted
Nov 13 12:14:20 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000400,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 12:14:20 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 12:14:20 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 12:14:26 doughnut kernel: b43-phy0: Controller restarted
Nov 13 12:14:26 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000400,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 12:14:26 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 12:14:26 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 12:14:31 doughnut kernel: b43-phy0: Controller restarted
Nov 13 12:14:32 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000400,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 12:14:32 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 12:14:32 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 13 12:14:37 doughnut kernel: b43-phy0: Controller restarted


Earlier I had a very similar error with a kernel compiled without 
Michael's patch but with Larry's patch. This one was compiled with
# CONFIG_ACPI_PROCESSOR is not set 
and ran for 15 mins before it produced this:- 

Nov 13 09:26:20 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x00000800,
0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 13 09:26:20 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 13 09:26:20 doughnut kernel: b43: Dump of last 20 DMA descriptors
Nov 13 09:26:20 doughnut kernel: b43: Descr.  0: 0x6000 0x30 0x36A06020 
0x8000
Nov 13 09:26:20 doughnut kernel: b43: Descr.  1: 0x8000 0x6E 0x360721F2 
0x8000
Nov 13 09:26:20 doughnut kernel: b43: Descr.  2: 0x0 0x930 0x3020 0x8000
Nov 13 09:26:20 doughnut kernel: b43: Descr.  3: 0x0 0x930 0x36662020 0x8000
Nov 13 09:26:20 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x36661020 0x800

Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-13 Thread Michael Buesch
On Friday 13 November 2009 01:02:44 Larry Finger wrote:
> On 11/12/2009 05:57 PM, Michael Buesch wrote:
> > On Friday 13 November 2009 00:23:59 Larry Finger wrote:
> >> No, that was a 14e4:4311. I have now installed that same card and it seems
> >> to be working without the workaround. When I had that problem, I had a
> >> different laptop than I do now, thus it is not possible to reproduce the
> >> setup. I am also using later firmware - version 508.1107. There may or may
> >> not be any differences in the rev 13 fw files.
> > 
> > Did the machine have more than 1G RAM?
> 
> Yes - 1.5 GB.

Ok, so my guess is that the DMA allocator simply returned high memory
that was unusable to the device. My new code explicitly checks for that (and a
few other things) and retries with GFP_DMA in case the address has illegal bits
set. That's the same thing we do for the frame buffers, so I don't see anything
wrong with it.

Yeah, the new code is big and scary, but I think it makes a whole lot more
sense than what we have now. Especially if I add some comments and do more
cleanups.
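
The "illegal bits" check described above can be illustrated in plain userspace
C. This is a minimal sketch and not the code from the patch. The helper names
dma_mask and dma_range_ok are hypothetical, and the 30-bit mask used in the
note below is an assumption chosen only to show why a machine with more than
1G of RAM matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a mask with the low `bits` bits set (1 <= bits <= 64). */
static uint64_t dma_mask(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * True if every byte of [addr, addr + size) is addressable under `mask`.
 * A false result is the case where a driver would retry the allocation
 * from a lower memory zone.
 */
static bool dma_range_ok(uint64_t addr, uint64_t size, uint64_t mask)
{
	uint64_t last;

	assert(size > 0);
	last = addr + size - 1;
	if (last < addr)	/* the range wrapped around */
		return false;
	return (last & ~mask) == 0;
}
```

Under a 30-bit mask (an assumption) a buffer at 0x2FABC020 passes, while one
at 0x5FABC020, which is above the 1 GiB line, fails and would need the retry.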

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Larry Finger
On 11/12/2009 05:57 PM, Michael Buesch wrote:
> On Friday 13 November 2009 00:23:59 Larry Finger wrote:
>> No, that was a 14e4:4311. I have now installed that same card and it seems
>> to be working without the workaround. When I had that problem, I had a
>> different laptop than I do now, thus it is not possible to reproduce the
>> setup. I am also using later firmware - version 508.1107. There may or may
>> not be any differences in the rev 13 fw files.
> 
> Did the machine have more than 1G RAM?

Yes - 1.5 GB.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Friday 13 November 2009 00:23:59 Larry Finger wrote:
> No, that was a 14e4:4311. I have now installed that same card and it seems to
> be working without the workaround. When I had that problem, I had a different
> laptop than I do now, thus it is not possible to reproduce the setup. I am
> also using later firmware - version 508.1107. There may or may not be any
> differences in the rev 13 fw files.

Did the machine have more than 1G RAM?

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Larry Finger
On 11/12/2009 05:08 PM, Michael Buesch wrote:
> On Friday 13 November 2009 00:04:50 Larry Finger wrote:
>> On 11/12/2009 03:16 PM, Michael Buesch wrote:
>>> On Thursday 12 November 2009 20:33:54 Larry Finger wrote:
>>>> While Michael is coming up with a test patch,
>>>
>>> Here you go:
>>> http://bu3sch.de/patches/wireless-testing/20091112-2213/patches/001-b43-rewrite-dma-ring-alloc.patch
>>> Please test this patch (also on 64bit-DMA devices that currently work).
>>>
>>> It seriously lacks some comments, but I'll add them later if that works.
>>
>> The patch did not break my working 4315 (LP PHY) system.
> 
> Did this system need the always-set-GFP_DMA-workaround?

No, that was a 14e4:4311. I have now installed that same card and it seems to be
working without the workaround. When I had that problem, I had a different
laptop than I do now, thus it is not possible to reproduce the setup. I am
also using later firmware - version 508.1107. There may or may not be any
differences in the rev 13 fw files.

If anyone reading this list has an HP dv2100 series laptop with a BCM4311, does
this patch work for you?

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Friday 13 November 2009 00:04:50 Larry Finger wrote:
> On 11/12/2009 03:16 PM, Michael Buesch wrote:
> > On Thursday 12 November 2009 20:33:54 Larry Finger wrote:
> >> While Michael is coming up with a test patch,
> > 
> > Here you go:
> > http://bu3sch.de/patches/wireless-testing/20091112-2213/patches/001-b43-rewrite-dma-ring-alloc.patch
> > Please test this patch (also on 64bit-DMA devices that currently work).
> > 
> > It seriously lacks some comments, but I'll add them later if that works.
> 
> The patch did not break my working 4315 (LP PHY) system.

Did this system need the always-set-GFP_DMA-workaround?

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Larry Finger
On 11/12/2009 03:16 PM, Michael Buesch wrote:
> On Thursday 12 November 2009 20:33:54 Larry Finger wrote:
>> While Michael is coming up with a test patch,
> 
> Here you go:
> http://bu3sch.de/patches/wireless-testing/20091112-2213/patches/001-b43-rewrite-dma-ring-alloc.patch
> Please test this patch (also on 64bit-DMA devices that currently work).
> 
> It seriously lacks some comments, but I'll add them later if that works.

The patch did not break my working 4315 (LP PHY) system.

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 22:59:26 William Bourque wrote:
> Michael Buesch wrote:
> > On Thursday 12 November 2009 22:34:00 William Bourque wrote:
> >> Michael Buesch wrote:
> >>> On Thursday 12 November 2009 20:32:32 William Bourque wrote:
> >>>> Sorry for the late reply... I seem to have the exact same bug here. Do
> >>>> you need more people to run the diagnostic patch?
> >>> Well, it doesn't hurt.
> >>>
> >> Here we go.
> >>
> >> I think we can observe the same problem :
> >> [  109.273027] b43: Descr.  0: 0x1000 0x930 0x26296020 0x8000
> >> [  109.273038] b43: Descr.  1: 0x0 0x930 0x26295020 0x8000
> >> [  109.273047] b43: Descr.  2: 0x0 0x930 0x26294020 0x8000
> >> ...
> >> This Descr. 0 seems weird to me.
> > 
> > No it looks OK to me.
> > descr0 most likely is the last descriptor in your RX ring, so it has
> > the descriptor-table-end bit set.
> > 
> >> Note : I'm using 4kb kernel stack and SLAB (just saying, if I read 
> >> correctly the last few mails, you were suspecting something about this).
> > 
> > No this has nothing to do with the stack.
> > 
> >> I will try with Michael's patch and see what it does.
> > 
> > Yeah, please do so.
> > 
> 
> It just finished compiling. Note that I applied the new patch over the 
> previous diagnostic patch, not sure if it can cause a problem somehow.

That should be fine.

> Here is the output of dmesg. Looks like it didn't have any effect in my case:

Ok, thanks for testing.

I'd still like Larry to test the patch, because AFAIR he was the one
who added the always-GFP_DMA-on-64bit workaround. This patch removes it.
The always-GFP_DMA doesn't make any sense, however it did help in the past
to get a device working (I hope Larry remembers which one that was).

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread William Bourque
Michael Buesch wrote:
> On Thursday 12 November 2009 22:34:00 William Bourque wrote:
>> Michael Buesch wrote:
>>> On Thursday 12 November 2009 20:32:32 William Bourque wrote:
>>>> Sorry for the late reply... I seem to have the exact same bug here. Do 
>>>> you need more people to run the diagnostic patch?
>>> Well, it doesn't hurt.
>>>
>> Here we go.
>>
>> I think we can observe the same problem :
>> [  109.273027] b43: Descr.  0: 0x1000 0x930 0x26296020 0x8000
>> [  109.273038] b43: Descr.  1: 0x0 0x930 0x26295020 0x8000
>> [  109.273047] b43: Descr.  2: 0x0 0x930 0x26294020 0x8000
>> ...
>> This Descr. 0 seems weird to me.
> 
> No it looks OK to me.
> descr0 most likely is the last descriptor in your RX ring, so it has
> the descriptor-table-end bit set.
> 
>> Note : I'm using 4kb kernel stack and SLAB (just saying, if I read 
>> correctly the last few mails, you were suspecting something about this).
> 
> No this has nothing to do with the stack.
> 
>> I will try with Michael's patch and see what it does.
> 
> Yeah, please do so.
> 

It just finished compiling. Note that I applied the new patch over the 
previous diagnostic patch, not sure if it can cause a problem somehow.

r...@mini ~ # uname -a
Linux mini 2.6.32-rc6-wl-wireless-testing-b43-DMA-41742-g3e14c6f-dirty 
#2 SMP PREEMPT Thu Nov 12 16:37:26 EST 2009 i686 GNU/Linux

Here is the output of dmesg. Looks like it didn't have any effect in my case:

[7.512335] ACPI: WMI: Mapper loaded
[7.520213] b43-phy0: Broadcom 4312 WLAN found (core revision 15)
[7.620163] b43-phy0 debug: Found PHY: Analog 6, Type 5, Revision 1
[7.620190] b43-phy0 debug: Found Radio: Manuf 0x17F, Version 0x2062, 
Revision 2
[7.702892] phy0: Selected rate control algorithm 'minstrel'
[7.703252] Registered led device: b43-phy0::tx
[7.703301] Registered led device: b43-phy0::rx
[7.703360] Registered led device: b43-phy0::radio
[7.703709] Broadcom 43xx driver loaded [ Features: PLS, Firmware-ID: 
FW13 ]
...skipped
[  167.140300] b43 ssb0:0: firmware: requesting b43/ucode15.fw
[  167.155323] b43 ssb0:0: firmware: requesting b43/lp0initvals15.fw
[  167.166946] b43 ssb0:0: firmware: requesting b43/lp0bsinitvals15.fw
[  167.320293] b43-phy0: Loading firmware version 410.2160 (2007-05-26 
15:32:10)
[  167.322759] b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
[  181.210227] b43-phy0 debug: Chip initialized
[  181.210588] b43-phy0 debug: 64-bit DMA initialized
[  181.210719] b43-phy0 debug: QoS enabled
[  181.251102] b43-phy0 debug: Wireless interface started
[  181.251113] b43-phy0 debug: Adding Interface type 2
[  181.270285] b43-phy0 ERROR: Fatal DMA error: 0x00000400, 0x00000000,
0x00000000, 0x00000000, 0x00000000, 0x00000000
[  181.270321] b43-phy0: Controller RESET (DMA error) ...
[  181.270334] b43: Dump of last 20 DMA descriptors
[  181.270345] b43: Descr.  0: 0x1000 0x930 0x2FABC020 0x8000
[  181.270355] b43: Descr.  1: 0x0 0x930 0x2FABB020 0x8000
[  181.270365] b43: Descr.  2: 0x0 0x930 0x2FABA020 0x8000
[  181.270374] b43: Descr.  3: 0x0 0x930 0x2FAB9020 0x8000
[  181.270384] b43: Descr.  4: 0x0 0x930 0x2FAB8020 0x8000
[  181.270393] b43: Descr.  5: 0x0 0x930 0x2FA3F020 0x8000
[  181.270403] b43: Descr.  6: 0x0 0x930 0x2FA3E020 0x8000
[  181.270413] b43: Descr.  7: 0x0 0x930 0x2FA3D020 0x8000
[  181.270422] b43: Descr.  8: 0x0 0x930 0x2FA3C020 0x8000
[  181.270432] b43: Descr.  9: 0x0 0x930 0x2FA2F020 0x8000
[  181.270442] b43: Descr. 10: 0x0 0x930 0x2FA2E020 0x8000
[  181.270451] b43: Descr. 11: 0x0 0x930 0x2FA2D020 0x8000
[  181.270469] b43: Descr. 12: 0x0 0x930 0x2FA2C020 0x8000
[  181.270474] b43: Descr. 13: 0x0 0x930 0x2FA03020 0x8000
[  181.270479] b43: Descr. 14: 0x0 0x930 0x2FA02020 0x8000
[  181.270484] b43: Descr. 15: 0x0 0x930 0x2FA01020 0x8000
[  181.270489] b43: Descr. 16: 0x0 0x930 0x2FA00020 0x8000
[  181.270493] b43: Descr. 17: 0x0 0x930 0x2FAAB020 0x8000
[  181.270498] b43: Descr. 18: 0x0 0x930 0x2FAAA020 0x8000
[  181.270503] b43: Descr. 19: 0x0 0x930 0x2FAA9020 0x8000
[  181.270509] b43-phy0 ERROR: Fatal DMA error: 0x00000400, 0x00000000,
0x00000000, 0x00000000, 0x00000000, 0x00000000
[  181.270516] b43-phy0: Controller RESET (DMA error) ...
[  181.270691] b43-phy0 debug: Wireless interface stopped
[  181.270702] b43-phy0 debug: DMA-64 rx_ring: Used slots 0/64, Failed 
frames 0/0 = 0.0%, Average tries 0.00
[  181.270770] b43-phy0 debug: DMA-64 tx_ring_AC_BK: Used slots 0/256, 
Failed frames 0/0 = 0.0%, Average tries 0.00
[  181.290238] b43-phy0 debug: DMA-64 tx_ring_AC_BE: Used slots 0/256, 
Failed frames 0/0 = 0.0%, Average tries 0.00
[  181.310140] b43-phy0 debug: DMA-64 tx_ring_AC_VI: Used slots 0/256, 
Failed fra

Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 22:34:00 William Bourque wrote:
> Michael Buesch wrote:
> > On Thursday 12 November 2009 20:32:32 William Bourque wrote:
> >> Sorry for the late reply... I seem to have the exact same bug here. Do 
> >> you need more people to run the diagnostic patch?
> > 
> > Well, it doesn't hurt.
> > 
> 
> Here we go.
> 
> I think we can observe the same problem :
> [  109.273027] b43: Descr.  0: 0x1000 0x930 0x26296020 0x8000
> [  109.273038] b43: Descr.  1: 0x0 0x930 0x26295020 0x8000
> [  109.273047] b43: Descr.  2: 0x0 0x930 0x26294020 0x8000
> ...
> This Descr. 0 seems weird to me.

No it looks OK to me.
descr0 most likely is the last descriptor in your RX ring, so it has
the descriptor-table-end bit set.
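
A sketch of how the first control word of a 64-bit descriptor could be
decoded. The flag values are assumptions modelled on the b43 driver's dma.h
and should be checked against the actual tree. The dumps in this thread show
the word as 0x1000, and the sketch further assumes that this stands for
0x10000000 with a run of zeros lost somewhere on the way into the archive.
The helper names has_flag and byte_count are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag layout for control word 0; verify against dma.h. */
#define DCTL0_DTABLEEND		UINT32_C(0x10000000)
#define DCTL0_IRQ		UINT32_C(0x20000000)
#define DCTL0_FRAMEEND		UINT32_C(0x40000000)
#define DCTL0_FRAMESTART	UINT32_C(0x80000000)
/* Assumed byte-count field in control word 1. */
#define DCTL1_BYTECNT		UINT32_C(0x00001FFF)

/* True if every bit of `flag` is set in control word 0. */
static bool has_flag(uint32_t ctl0, uint32_t flag)
{
	assert(flag != 0);
	return (ctl0 & flag) == flag;
}

/* Buffer length carried in control word 1. */
static uint32_t byte_count(uint32_t ctl1)
{
	return ctl1 & DCTL1_BYTECNT;
}
```

Under these assumptions, a control word of 0x10000000 carries only the
table-end flag, and the second column value 0x930 is a buffer length of
2352 bytes.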

> Note : I'm using 4kb kernel stack and SLAB (just saying, if I read 
> correctly the last few mails, you were suspecting something about this).

No this has nothing to do with the stack.

> I will try with Michael's patch and see what it does.

Yeah, please do so.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread William Bourque
Michael Buesch wrote:
> On Thursday 12 November 2009 20:32:32 William Bourque wrote:
>> Sorry for the late reply... I seem to have the exact same bug here. Do 
>> you need more people to run the diagnostic patch?
> 
> Well, it doesn't hurt.
> 

Here we go.

I think we can observe the same problem :
[  109.273027] b43: Descr.  0: 0x1000 0x930 0x26296020 0x8000
[  109.273038] b43: Descr.  1: 0x0 0x930 0x26295020 0x8000
[  109.273047] b43: Descr.  2: 0x0 0x930 0x26294020 0x8000
...
This Descr. 0 seems weird to me.

Note : I'm using 4kb kernel stack and SLAB (just saying, if I read 
correctly the last few mails, you were suspecting something about this).

I will try with Michael's patch and see what it does. Ask me if you need 
to test something else.

- William



[   95.270306] b43 ssb0:0: firmware: requesting b43/ucode15.fw
[   95.286145] b43 ssb0:0: firmware: requesting b43/lp0initvals15.fw
[   95.297707] b43 ssb0:0: firmware: requesting b43/lp0bsinitvals15.fw
[   95.450299] b43-phy0: Loading firmware version 410.2160 (2007-05-26 
15:32:10)
[   95.452754] b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
[  109.212708] b43-phy0 debug: Chip initialized
[  109.213037] b43-phy0 debug: 64-bit DMA initialized
[  109.213169] b43-phy0 debug: QoS enabled
[  109.253562] b43-phy0 debug: Wireless interface started
[  109.272964] b43-phy0 ERROR: Fatal DMA error: 0x00000400, 0x00000000,
0x00000000, 0x00000000, 0x00000000, 0x00000000
[  109.272990] b43-phy0: Controller RESET (DMA error) ...
[  109.273017] b43: Dump of last 20 DMA descriptors
[  109.273027] b43: Descr.  0: 0x1000 0x930 0x26296020 0x8000
[  109.273038] b43: Descr.  1: 0x0 0x930 0x26295020 0x8000
[  109.273047] b43: Descr.  2: 0x0 0x930 0x26294020 0x8000
[  109.273057] b43: Descr.  3: 0x0 0x930 0x26293020 0x8000
[  109.273067] b43: Descr.  4: 0x0 0x930 0x26292020 0x8000
[  109.273077] b43: Descr.  5: 0x0 0x930 0x26291020 0x8000
[  109.273086] b43: Descr.  6: 0x0 0x930 0x26290020 0x8000
[  109.273096] b43: Descr.  7: 0x0 0x930 0x262B7020 0x8000
[  109.273106] b43: Descr.  8: 0x0 0x930 0x262B6020 0x8000
[  109.273115] b43: Descr.  9: 0x0 0x930 0x262B5020 0x8000
[  109.273125] b43: Descr. 10: 0x0 0x930 0x262B4020 0x8000
[  109.273135] b43: Descr. 11: 0x0 0x930 0x2628B020 0x8000
[  109.273144] b43: Descr. 12: 0x0 0x930 0x2628A020 0x8000
[  109.273154] b43: Descr. 13: 0x0 0x930 0x26289020 0x8000
[  109.273164] b43: Descr. 14: 0x0 0x930 0x26288020 0x8000
[  109.273174] b43: Descr. 15: 0x0 0x930 0x2622B020 0x8000
[  109.273183] b43: Descr. 16: 0x0 0x930 0x2622A020 0x8000
[  109.273193] b43: Descr. 17: 0x0 0x930 0x2627B020 0x8000
[  109.273203] b43: Descr. 18: 0x0 0x930 0x2627A020 0x8000
[  109.273212] b43: Descr. 19: 0x0 0x930 0x26255020 0x8000
[  109.273230] b43-phy0 ERROR: Fatal DMA error: 0x00000400, 0x00000000,
0x00000000, 0x00000000, 0x00000000, 0x00000000
[  109.273250] b43-phy0: Controller RESET (DMA error) ...
[  109.273279] b43-phy0 debug: Adding Interface type 2
[  109.273550] b43-phy0 debug: Wireless interface stopped
[  109.273562] b43-phy0 debug: DMA-64 rx_ring: Used slots 0/64, Failed 
frames 0/0 = 0.0%, Average tries 0.00
[  109.273634] b43-phy0 debug: DMA-64 tx_ring_AC_BK: Used slots 0/256, 
Failed frames 0/0 = 0.0%, Average tries 0.00
[  109.290240] b43-phy0 debug: DMA-64 tx_ring_AC_BE: Used slots 0/256, 
Failed frames 0/0 = 0.0%, Average tries 0.00
[  109.310088] b43-phy0 debug: DMA-64 tx_ring_AC_VI: Used slots 0/256, 
Failed frames 0/0 = 0.0%, Average tries 0.00
[  109.330240] b43-phy0 debug: DMA-64 tx_ring_AC_VO: Used slots 0/256, 
Failed frames 0/0 = 0.0%, Average tries 0.00
[  109.350234] b43-phy0 debug: DMA-64 tx_ring_mcast: Used slots 0/256, 
Failed frames 0/0 = 0.0%, Average tries 0.00
[  109.620178] b43-phy0: Loading firmware version 410.2160 (2007-05-26 
15:32:10)
[  109.622613] b43-phy0 debug: b2062: Using crystal tab entry 19200 kHz.
[  123.396317] b43-phy0 debug: Chip initialized
[  123.396632] b43-phy0 debug: 64-bit DMA initialized
[  123.396766] b43-phy0 debug: QoS enabled
[  123.431086] b43-phy0 debug: Wireless interface started
[  123.431098] b43-phy0: Controller restarted
[  123.450896] b43-phy0 ERROR: Fatal DMA error: 0x00000400, 0x00000000,
0x00000000, 0x00000000, 0x00000000, 0x00000000



Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 20:33:54 Larry Finger wrote:
> While Michael is coming up with a test patch,

Here you go:
http://bu3sch.de/patches/wireless-testing/20091112-2213/patches/001-b43-rewrite-dma-ring-alloc.patch
Please test this patch (also on 64bit-DMA devices that currently work).

It seriously lacks some comments, but I'll add them later if that works.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 21:10:59 Larry Finger wrote:
> Do the address_low values for 8 and 9 look right? They
> should be aligned on a 4K boundary.

Is this really a requirement? I think the 4k alignment is only required
for the descriptor memory. We never guaranteed any alignment for the skbs.
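
To make the alignment question concrete, the offset of a buffer address within
a 4K block can be computed as in the sketch below. The helper names
offset_in_block and is_aligned are hypothetical. The addresses in the note are
the address_low values of descriptors 8 and 9 from the dump under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset of `addr` within a block of `align` bytes (a power of two). */
static uint32_t offset_in_block(uint32_t addr, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return addr & (align - 1);
}

/* True if `addr` sits exactly on an `align`-byte boundary. */
static bool is_aligned(uint32_t addr, uint32_t align)
{
	return offset_in_block(addr, align) == 0;
}
```

For 0x364AE020 the offset is 0x020 and for 0x36972428 it is 0x428, so neither
buffer starts on a 4K boundary. That matches the point that alignment was never
guaranteed for the skbs.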

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Larry Finger
On 11/12/2009 01:48 PM, Michael Buesch wrote:
>> Now we have some progress. You will note the difference in the control words
>> (first 2 columns) for descriptors 8 & 9. They are wrong.
> 
> What do you think is wrong here? I think the control words are OK.

At the point where I captured them, I didn't think the flags had been set, but
now I see that they have. Do the address_low values for 8 and 9 look right? They
should be aligned on a 4K boundary.

Larry




Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 20:32:32 William Bourque wrote:
> Sorry for the late reply... I seem to have the exact same bug here. Do 
> you need more people to run the diagnostic patch?

Well, it doesn't hurt.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 20:33:54 Larry Finger wrote:
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  0: 0x0 0x930 0x364BD020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  1: 0x0 0x930 0x364BF020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  2: 0x0 0x930 0x364B7020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  3: 0x0 0x930 0x364B6020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x364B5020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  5: 0x0 0x930 0x364B4020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  6: 0x0 0x930 0x364A8020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  7: 0x0 0x930 0x364AF020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  8: 0x0 0x930 0x364AE020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr.  9: 0x6000 0x74 0x36972428 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 10: 0x8000 0x6E 0x36A1595A 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 11: 0x0 0x930 0x364AD020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 12: 0x0 0x930 0x364AC020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 13: 0x0 0x930 0x364AB020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 14: 0x0 0x930 0x364AA020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 15: 0x0 0x930 0x364A9020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 16: 0x0 0x930 0x36517020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 17: 0x0 0x930 0x36516020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 18: 0x0 0x930 0x36515020 
> > 0x8000
> > Nov 12 18:40:43 doughnut kernel: b43: Descr. 19: 0x0 0x930 0x36514020 
> > 0x8000
> 
> Now we have some progress. You will note the difference in the control words
> (first 2 columns) for descriptors 8 & 9. They are wrong.

What do you think is wrong here? I think the control words are OK.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread William Bourque

Michael Buesch wrote:
> On Thursday 12 November 2009 19:45:45 Andrew Benton wrote:
>> On 12/11/09 17:14, Larry Finger wrote:
>>> I guess I'm a failure at writing diagnostic patches. Until there is a DMA
>>> error, the only effect of the patch is to add a little extra time to the
>>> routine that fills in the descriptor structure, and it adds to the data and
>>> code size. If any of those changes have the effect of "fixing" your problem,
>>> then we have severe difficulties. I expect that you will get the error
>>> sooner or later.
>>>
>> Well to provoke it a little I recompiled with all the options on the 
>> ACPI menu set to y and now it fails to connect like so:-
>>
>>
>> Nov 12 18:40:38 doughnut kernel: wlan0: deauthenticating from 
>> 00:1e:2a:27:7e:62 by local choice (reason=3)
>> Nov 12 18:40:38 doughnut kernel: wlan0: direct probe to AP 00:1e:2a:27:7e:62 
>> (try 1)
>> Nov 12 18:40:38 doughnut kernel: wlan0: direct probe responded
>> Nov 12 18:40:38 doughnut kernel: wlan0: authenticate with AP 
>> 00:1e:2a:27:7e:62 (try 1)
>> Nov 12 18:40:38 doughnut kernel: wlan0: authenticated
>> Nov 12 18:40:38 doughnut kernel: wlan0: associate with AP 00:1e:2a:27:7e:62 
>> (try 1)
>> Nov 12 18:40:38 doughnut kernel: wlan0: RX AssocResp from 00:1e:2a:27:7e:62 
>> (capab=0x411 status=0 aid=1)
>> Nov 12 18:40:38 doughnut kernel: wlan0: associated
>> Nov 12 18:40:40 doughnut ntpd[514]: frequency initialized -66.915 PPM from 
>> /home/boot/ntp.drift
>> Nov 12 18:40:43 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 
>> 0x0800, 0x, 0x, 0x, 0x, 0x
>> Nov 12 18:40:43 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
>> Nov 12 18:40:43 doughnut kernel: b43: Dump of last 20 DMA descriptors
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  0: 0x0 0x930 0x364BD020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  1: 0x0 0x930 0x364BF020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  2: 0x0 0x930 0x364B7020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  3: 0x0 0x930 0x364B6020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x364B5020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  5: 0x0 0x930 0x364B4020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  6: 0x0 0x930 0x364A8020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  7: 0x0 0x930 0x364AF020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  8: 0x0 0x930 0x364AE020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr.  9: 0x6000 0x74 0x36972428 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 10: 0x8000 0x6E 0x36A1595A 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 11: 0x0 0x930 0x364AD020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 12: 0x0 0x930 0x364AC020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 13: 0x0 0x930 0x364AB020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 14: 0x0 0x930 0x364AA020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 15: 0x0 0x930 0x364A9020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 16: 0x0 0x930 0x36517020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 17: 0x0 0x930 0x36516020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 18: 0x0 0x930 0x36515020 0x8000
>> Nov 12 18:40:43 doughnut kernel: b43: Descr. 19: 0x0 0x930 0x36514020 0x8000
> 
> This looks OK.
> I guess there's something fishy going on in our descriptor memory allocation.
> We have some very weird workarounds in there which I still think are wrong.
> I'll try to rewrite that stuff and send a patch for testing later...
> 
Hi all

Sorry for the late reply... I seem to have the exact same bug here. Do 
you need more people to run the diagnostic patch?

The computer is an HP Mini 1116R with a Broadcom 4312 chipset.

will...@mini:~$ lspci -vnn | grep 14e4
01:00.0 Network controller [0280]: Broadcom Corporation BCM4312 
802.11b/g [14e4:4315] (rev 01)


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Larry Finger
On 11/12/2009 12:45 PM, Andrew Benton wrote:
> On 12/11/09 17:14, Larry Finger wrote:
>> I guess I'm a failure at writing diagnostic patches. Until there is a DMA error,
>> the only effect of the patch is to add a little extra time to the routine that
>> fills in the descriptor structure, and it adds to the data and code size. If any
>> of those changes have the effect of "fixing" your problem, then we have severe
>> difficulties. I expect that you will get the error sooner or later.
>>
> 
> Well to provoke it a little I recompiled with all the options on the 
> ACPI menu set to y and now it fails to connect like so:-
> 
> 
> Nov 12 18:40:38 doughnut kernel: wlan0: deauthenticating from 
> 00:1e:2a:27:7e:62 by local choice (reason=3)
> Nov 12 18:40:38 doughnut kernel: wlan0: direct probe to AP 00:1e:2a:27:7e:62 
> (try 1)
> Nov 12 18:40:38 doughnut kernel: wlan0: direct probe responded
> Nov 12 18:40:38 doughnut kernel: wlan0: authenticate with AP 
> 00:1e:2a:27:7e:62 (try 1)
> Nov 12 18:40:38 doughnut kernel: wlan0: authenticated
> Nov 12 18:40:38 doughnut kernel: wlan0: associate with AP 00:1e:2a:27:7e:62 
> (try 1)
> Nov 12 18:40:38 doughnut kernel: wlan0: RX AssocResp from 00:1e:2a:27:7e:62 
> (capab=0x411 status=0 aid=1)
> Nov 12 18:40:38 doughnut kernel: wlan0: associated
> Nov 12 18:40:40 doughnut ntpd[514]: frequency initialized -66.915 PPM from 
> /home/boot/ntp.drift
> Nov 12 18:40:43 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x0800, 
> 0x, 0x, 0x, 0x, 0x
> Nov 12 18:40:43 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
> Nov 12 18:40:43 doughnut kernel: b43: Dump of last 20 DMA descriptors
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  0: 0x0 0x930 0x364BD020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  1: 0x0 0x930 0x364BF020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  2: 0x0 0x930 0x364B7020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  3: 0x0 0x930 0x364B6020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x364B5020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  5: 0x0 0x930 0x364B4020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  6: 0x0 0x930 0x364A8020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  7: 0x0 0x930 0x364AF020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  8: 0x0 0x930 0x364AE020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  9: 0x6000 0x74 0x36972428 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 10: 0x8000 0x6E 0x36A1595A 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 11: 0x0 0x930 0x364AD020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 12: 0x0 0x930 0x364AC020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 13: 0x0 0x930 0x364AB020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 14: 0x0 0x930 0x364AA020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 15: 0x0 0x930 0x364A9020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 16: 0x0 0x930 0x36517020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 17: 0x0 0x930 0x36516020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 18: 0x0 0x930 0x36515020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 19: 0x0 0x930 0x36514020 0x8000

Now we have some progress. You will note the difference in the control words
(first 2 columns) for descriptors 8 & 9. They are wrong.
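
To make the control-word comparison concrete, here is a minimal sketch of how
the first two columns of the dump could be decoded. It assumes the 64-bit
descriptor flag values from b43's dma.h (FRAMESTART 0x80000000, FRAMEEND
0x40000000, IRQ 0x20000000, DTABLEEND 0x10000000, byte count in the low 13
bits of control1); please check them against your own tree. It also assumes
that the archived values 0x6000 and 0x8000 lost their trailing zeros and
originally read 0x60000000 and 0x80000000. The names desc_info and
decode_desc are made up for illustration and are not driver symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed flag layout of a 64-bit DMA descriptor (verify against dma.h). */
#define DCTL0_DTABLEEND  0x10000000u
#define DCTL0_IRQ        0x20000000u
#define DCTL0_FRAMEEND   0x40000000u
#define DCTL0_FRAMESTART 0x80000000u
#define DCTL1_BYTECNT    0x00001FFFu

/* Hypothetical decoded view of the two control words. */
struct desc_info {
	int frame_start;
	int frame_end;
	int irq;
	int table_end;
	unsigned int byte_count;
};

/* Split control0/control1 into individual flags and the buffer length. */
static void decode_desc(uint32_t ctl0, uint32_t ctl1, struct desc_info *out)
{
	assert(out != NULL);
	out->frame_start = (ctl0 & DCTL0_FRAMESTART) != 0;
	out->frame_end = (ctl0 & DCTL0_FRAMEEND) != 0;
	out->irq = (ctl0 & DCTL0_IRQ) != 0;
	out->table_end = (ctl0 & DCTL0_DTABLEEND) != 0;
	out->byte_count = ctl1 & DCTL1_BYTECNT;
}
```

Under those assumptions, the 0x6000/0x74 entry decodes as a frame-end
descriptor with the IRQ flag and a 116-byte buffer, the 0x8000/0x6E entry as
a frame-start descriptor with a 110-byte buffer, and the 0x0/0x930 entries as
flagless 2352-byte buffers.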

While Michael is coming up with a test patch, I'll nose around in the code. He
understands this part better than I, but even the blind squirrel gets a nut once
in a while.

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 19:45:45 Andrew Benton wrote:
> On 12/11/09 17:14, Larry Finger wrote:
> > I guess I'm a failure at writing diagnostic patches. Until there is a DMA error,
> > the only effect of the patch is to add a little extra time to the routine that
> > fills in the descriptor structure, and it adds to the data and code size. If any
> > of those changes have the effect of "fixing" your problem, then we have severe
> > difficulties. I expect that you will get the error sooner or later.
> > 
> 
> Well to provoke it a little I recompiled with all the options on the 
> ACPI menu set to y and now it fails to connect like so:-
> 
> 
> Nov 12 18:40:38 doughnut kernel: wlan0: deauthenticating from 
> 00:1e:2a:27:7e:62 by local choice (reason=3)
> Nov 12 18:40:38 doughnut kernel: wlan0: direct probe to AP 00:1e:2a:27:7e:62 
> (try 1)
> Nov 12 18:40:38 doughnut kernel: wlan0: direct probe responded
> Nov 12 18:40:38 doughnut kernel: wlan0: authenticate with AP 
> 00:1e:2a:27:7e:62 (try 1)
> Nov 12 18:40:38 doughnut kernel: wlan0: authenticated
> Nov 12 18:40:38 doughnut kernel: wlan0: associate with AP 00:1e:2a:27:7e:62 
> (try 1)
> Nov 12 18:40:38 doughnut kernel: wlan0: RX AssocResp from 00:1e:2a:27:7e:62 
> (capab=0x411 status=0 aid=1)
> Nov 12 18:40:38 doughnut kernel: wlan0: associated
> Nov 12 18:40:40 doughnut ntpd[514]: frequency initialized -66.915 PPM from 
> /home/boot/ntp.drift
> Nov 12 18:40:43 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x0800, 
> 0x, 0x, 0x, 0x, 0x
> Nov 12 18:40:43 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
> Nov 12 18:40:43 doughnut kernel: b43: Dump of last 20 DMA descriptors
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  0: 0x0 0x930 0x364BD020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  1: 0x0 0x930 0x364BF020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  2: 0x0 0x930 0x364B7020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  3: 0x0 0x930 0x364B6020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x364B5020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  5: 0x0 0x930 0x364B4020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  6: 0x0 0x930 0x364A8020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  7: 0x0 0x930 0x364AF020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  8: 0x0 0x930 0x364AE020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr.  9: 0x6000 0x74 0x36972428 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 10: 0x8000 0x6E 0x36A1595A 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 11: 0x0 0x930 0x364AD020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 12: 0x0 0x930 0x364AC020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 13: 0x0 0x930 0x364AB020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 14: 0x0 0x930 0x364AA020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 15: 0x0 0x930 0x364A9020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 16: 0x0 0x930 0x36517020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 17: 0x0 0x930 0x36516020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 18: 0x0 0x930 0x36515020 0x8000
> Nov 12 18:40:43 doughnut kernel: b43: Descr. 19: 0x0 0x930 0x36514020 0x8000

This looks OK.
I guess there's something fishy going on in our descriptor memory allocation.
We have some very weird workarounds in there which I still think are wrong.
I'll try to rewrite that stuff and send a patch for testing later...

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Andrew Benton
On 12/11/09 17:14, Larry Finger wrote:
> I guess I'm a failure at writing diagnostic patches. Until there is a DMA error,
> the only effect of the patch is to add a little extra time to the routine that
> fills in the descriptor structure, and it adds to the data and code size. If any
> of those changes have the effect of "fixing" your problem, then we have severe
> difficulties. I expect that you will get the error sooner or later.
> 

Well to provoke it a little I recompiled with all the options on the 
ACPI menu set to y and now it fails to connect like so:-


Nov 12 18:40:38 doughnut kernel: wlan0: deauthenticating from 00:1e:2a:27:7e:62 
by local choice (reason=3)
Nov 12 18:40:38 doughnut kernel: wlan0: direct probe to AP 00:1e:2a:27:7e:62 
(try 1)
Nov 12 18:40:38 doughnut kernel: wlan0: direct probe responded
Nov 12 18:40:38 doughnut kernel: wlan0: authenticate with AP 00:1e:2a:27:7e:62 
(try 1)
Nov 12 18:40:38 doughnut kernel: wlan0: authenticated
Nov 12 18:40:38 doughnut kernel: wlan0: associate with AP 00:1e:2a:27:7e:62 
(try 1)
Nov 12 18:40:38 doughnut kernel: wlan0: RX AssocResp from 00:1e:2a:27:7e:62 
(capab=0x411 status=0 aid=1)
Nov 12 18:40:38 doughnut kernel: wlan0: associated
Nov 12 18:40:40 doughnut ntpd[514]: frequency initialized -66.915 PPM from 
/home/boot/ntp.drift
Nov 12 18:40:43 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x0800, 
0x, 0x, 0x, 0x, 0x
Nov 12 18:40:43 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 12 18:40:43 doughnut kernel: b43: Dump of last 20 DMA descriptors
Nov 12 18:40:43 doughnut kernel: b43: Descr.  0: 0x0 0x930 0x364BD020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  1: 0x0 0x930 0x364BF020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  2: 0x0 0x930 0x364B7020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  3: 0x0 0x930 0x364B6020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  4: 0x0 0x930 0x364B5020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  5: 0x0 0x930 0x364B4020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  6: 0x0 0x930 0x364A8020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  7: 0x0 0x930 0x364AF020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  8: 0x0 0x930 0x364AE020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr.  9: 0x6000 0x74 0x36972428 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 10: 0x8000 0x6E 0x36A1595A 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 11: 0x0 0x930 0x364AD020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 12: 0x0 0x930 0x364AC020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 13: 0x0 0x930 0x364AB020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 14: 0x0 0x930 0x364AA020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 15: 0x0 0x930 0x364A9020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 16: 0x0 0x930 0x36517020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 17: 0x0 0x930 0x36516020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 18: 0x0 0x930 0x36515020 0x8000
Nov 12 18:40:43 doughnut kernel: b43: Descr. 19: 0x0 0x930 0x36514020 0x8000
Nov 12 18:40:43 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 12 18:40:49 doughnut kernel: b43-phy0: Controller restarted
Nov 12 18:40:49 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x0400, 
0x, 0x, 0x, 0x, 0x
Nov 12 18:40:49 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 12 18:40:49 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 12 18:40:55 doughnut kernel: b43-phy0: Controller restarted
Nov 12 18:40:55 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x0400, 
0x, 0x, 0x, 0x, 0x
Nov 12 18:40:55 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 12 18:40:55 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 12 18:41:00 doughnut kernel: b43-phy0: Controller restarted
Nov 12 18:41:00 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x0400, 
0x, 0x, 0x, 0x, 0x
Nov 12 18:41:00 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 12 18:41:01 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 12 18:41:06 doughnut kernel: b43-phy0: Controller restarted
Nov 12 18:41:06 doughnut kernel: b43-phy0 ERROR: Fatal DMA error: 0x0400, 
0x, 0x, 0x0400, 0x, 0x
Nov 12 18:41:06 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 12 18:41:06 doughnut kernel: b43-phy0: Loading firmware version 410.2160 
(2007-05-26 15:32:10)
Nov 12 18:41:12 doughnut kernel: b43-phy0: Controller restarted
Nov 12 18:41:12 doughnut kernel: b43-phy0 ERROR: F

Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Larry Finger
On 11/12/2009 10:53 AM, Michael Buesch wrote:
> On Thursday 12 November 2009 17:42:07 Andrew Benton wrote:
>> On 12/11/09 15:50, Larry Finger wrote:
>>> Sorry about the kernel mismatch. I developed that patch while offline and
>>> waiting at an auto repair place and forgot to refresh my sources before sending
>>> it. The revised version that Michael sent should work. If not, please let me
>>> know and I will send an updated version.
>>
>> Indeed, it works. I've just been browsing for more than an hour with no 
>> problems or error messages in the syslog. I've got to do some housework 
>> now. I'll test it some more later.
>> Thanks!
> 
> Ehm, this patch is not supposed to fix anything. It's just a debugging patch that's
> more verbose in the failure case.

I guess I'm a failure at writing diagnostic patches. Until there is a DMA error,
the only effect of the patch is to add a little extra time to the routine that
fills in the descriptor structure, and it adds to the data and code size. If any
of those changes have the effect of "fixing" your problem, then we have severe
difficulties. I expect that you will get the error sooner or later.

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 17:42:07 Andrew Benton wrote:
> On 12/11/09 15:50, Larry Finger wrote:
> > Sorry about the kernel mismatch. I developed that patch while offline and
> > waiting at an auto repair place and forgot to refresh my sources before sending
> > it. The revised version that Michael sent should work. If not, please let me
> > know and I will send an updated version.
> 
> Indeed, it works. I've just been browsing for more than an hour with no 
> problems or error messages in the syslog. I've got to do some housework 
> now. I'll test it some more later.
> Thanks!

Ehm, this patch is not supposed to fix anything. It's just a debugging patch that's
more verbose in the failure case.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Andrew Benton
On 12/11/09 15:50, Larry Finger wrote:
> Sorry about the kernel mismatch. I developed that patch while offline and
> waiting at an auto repair place and forgot to refresh my sources before sending
> it. The revised version that Michael sent should work. If not, please let me
> know and I will send an updated version.

Indeed, it works. I've just been browsing for more than an hour with no 
problems or error messages in the syslog. I've got to do some housework 
now. I'll test it some more later.
Thanks!

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Larry Finger
On 11/12/2009 06:16 AM, Andrew Benton wrote:
> On 12/11/09 00:37, Larry Finger wrote:
>> Andy,
>>
>> Please try the patch below to see what we can learn from the DMA descriptor
>> errors. Some of this code is temporary, but there are also some statements that
>> will probably become permanent.
>>
>> Please post any messages that result.
> 
> The patch failed
> 
> patching file drivers/net/wireless/b43/dma.c
> Hunk #3 FAILED at 1224.
> Hunk #4 FAILED at 1247.
> Hunk #5 succeeded at 1625 (offset -8 lines).
> 2 out of 5 hunks FAILED -- saving rejects to file drivers/net/wireless/b43/dma.c.rej
> patching file drivers/net/wireless/b43/dma.h
> patching file drivers/net/wireless/b43/main.c
> andy:~$
> 
> I tried to apply it to the wireless-testing kernel
> 
> This line from your patch:-
> bounce_skb = __dev_alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
> 
> isn't in my copy of drivers/net/wireless/b43/dma.c, in mine it looks 
> like this:-
> priv_info->bouncebuffer = kmalloc(skb->len, GFP_ATOMIC | GFP_DMA);
> 
> Should I be working with a different kernel source?

Sorry about the kernel mismatch. I developed that patch while offline and
waiting at an auto repair place and forgot to refresh my sources before sending
it. The revised version that Michael sent should work. If not, please let me
know and I will send an updated version.

Thanks for memory testing. We can eliminate the RAM as a source of the failure.

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 13:16:31 Andrew Benton wrote:
> On 12/11/09 00:37, Larry Finger wrote:
> > Andy,
> >
> > Please try the patch below to see what we can learn from the DMA descriptor
> > errors. Some of this code is temporary, but there are also some statements that
> > will probably become permanent.
> >
> > Please post any messages that result.
> 
> The patch failed

Larry had an outdated tree.
This patch applies against wireless-testing:

>From larry.fin...@lwfinger.net Thu Nov 12 01:37:32 2009
Message-ID: <4afb58cc.6010...@lwfinger.net>
Date: Wed, 11 Nov 2009 18:37:32 -0600
From: Larry Finger 
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US;
rv:1.9.1.4pre) Gecko/20090915 SUSE/3.0b4-3.6 Thunderbird/3.0b4
MIME-Version: 1.0
To: Andrew Benton 
Subject: Re: b43-phy0 ERROR: Fatal DMA error: 0x0400
References: <4afa09c8.4060...@gmail.com>
In-Reply-To: <4afa09c8.4060...@gmail.com>
Cc: bcm43xx-dev@lists.berlios.de,
 Michael Buesch 

Andy,

Please try the patch below to see what we can learn from the DMA descriptor
errors. Some of this code is temporary, but there are also some statements that
will probably become permanent.

Please post any messages that result.

Larry


---
 drivers/net/wireless/b43/dma.c  |   32 
 drivers/net/wireless/b43/dma.h  |1 +
 drivers/net/wireless/b43/main.c |1 +
 3 files changed, 34 insertions(+)

--- wireless-testing.orig/drivers/net/wireless/b43/dma.c
+++ wireless-testing/drivers/net/wireless/b43/dma.c
@@ -46,6 +46,8 @@
  * into separate slots. */
 #define TX_SLOTS_PER_FRAME 2
 
+int dma_point = 0;
+struct b43_dmadesc_generic dma_desc_save[20];
 
 /* 32bit DMA ops. */
 static
@@ -190,6 +192,12 @@ static void op64_fill_descriptor(struct 
desc->dma64.control1 = cpu_to_le32(ctl1);
desc->dma64.address_low = cpu_to_le32(addrlo);
desc->dma64.address_high = cpu_to_le32(addrhi);
+   dma_desc_save[dma_point].dma64.control0 = desc->dma64.control0;
+   dma_desc_save[dma_point].dma64.control1 = desc->dma64.control1;
+   dma_desc_save[dma_point].dma64.address_low = desc->dma64.address_low;
+   dma_desc_save[dma_point].dma64.address_high = desc->dma64.address_high;
+   if (++dma_point >= 20)
+   dma_point = 0;
 }
 
 static void op64_poke_tx(struct b43_dmaring *ring, int slot)
@@ -1216,8 +1224,11 @@ static int dma_tx_fragment(struct b43_dm
meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
/* create a bounce buffer in zone_dma on mapping failure. */
if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
+   printk(KERN_INFO "b43: Using bounce buffer\n");
priv_info->bouncebuffer = kmalloc(skb->len, GFP_ATOMIC | 
GFP_DMA);
if (!priv_info->bouncebuffer) {
+   b43warn(ring->dev->wl, "Bounce buffer allocation &qu

Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Andrew Benton
On 12/11/09 00:37, Larry Finger wrote:
> Andy,
>
> Please try the patch below to see what we can learn from the DMA descriptor
> errors. Some of this code is temporary, but there are also some statements that
> will probably become permanent.
>
> Please post any messages that result.

The patch failed

patching file drivers/net/wireless/b43/dma.c
Hunk #3 FAILED at 1224.
Hunk #4 FAILED at 1247.
Hunk #5 succeeded at 1625 (offset -8 lines).
2 out of 5 hunks FAILED -- saving rejects to file drivers/net/wireless/b43/dma.c.rej
patching file drivers/net/wireless/b43/dma.h
patching file drivers/net/wireless/b43/main.c
andy:~$

I tried to apply it to the wireless-testing kernel

This line from your patch:-
bounce_skb = __dev_alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);

isn't in my copy of drivers/net/wireless/b43/dma.c; in mine it looks
like this:-
priv_info->bouncebuffer = kmalloc(skb->len, GFP_ATOMIC | GFP_DMA);

Should I be working with a different kernel source?
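
Both lines quoted above allocate a bounce buffer with GFP_DMA after the
mapping check has failed. As a rough illustration of the kind of
address-range test that can trigger such a fallback, here is a user-space
sketch. It is an assumption about the general technique, not a copy of
b43_dma_mapping_error; dma_range_ok and needs_bounce_buffer are hypothetical
names, and the 30/32/64-bit widths are only example engine limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the buffer [addr, addr + len) lies
 * entirely below 2^addr_bits, i.e. a DMA engine with that many address
 * bits could reach every byte of it.
 */
static bool dma_range_ok(uint64_t addr, size_t len, unsigned int addr_bits)
{
	assert(addr_bits >= 1 && addr_bits <= 64);
	if (addr_bits == 64)
		return true;	/* no upper limit is modelled for 64-bit engines */

	uint64_t limit = (uint64_t)1 << addr_bits;

	/* Written without addr + len so the sum cannot overflow. */
	return addr < limit && len <= limit - addr;
}

/* A buffer that fails the range test would have to be copied lower. */
static bool needs_bounce_buffer(uint64_t addr, size_t len,
				unsigned int addr_bits)
{
	return !dma_range_ok(addr, len, addr_bits);
}
```

With this sketch, a 0x930-byte buffer at 0x364AE020 passes for a 30-bit
engine, while a 0x200-byte buffer starting at 0x3FFFFF00 would cross the
30-bit boundary and need a bounce buffer.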

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Michael Buesch
On Thursday 12 November 2009 12:42:47 Andrew Benton wrote:
> On 11/11/09 19:12, Larry Finger wrote:
> >
> > Such an error in SLUB handling could be arising from a DMA problem in b43, but
> > it could also arise from a memory error. Please run memtest86+ for an extended
> > period so that a hardware error can be ruled out. A 24 hour run would be good.
> > If that is not possible, please do at least a 12 hour test.
> >
> 
> I've run memtest86+ for more than 12 hours, so far with no errors. I'll 
> run it a while longer but I don't think there's anything wrong with the 
> hardware.

I'd rather suggest trying Larry's patch instead of running another memtest round.

-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-12 Thread Andrew Benton
On 11/11/09 19:12, Larry Finger wrote:
>
> Such an error in SLUB handling could be arising from a DMA problem in b43, but
> it could also arise from a memory error. Please run memtest86+ for an extended
> period so that a hardware error can be ruled out. A 24 hour run would be good.
> If that is not possible, please do at least a 12 hour test.
>

I've run memtest86+ for more than 12 hours, so far with no errors. I'll 
run it a while longer but I don't think there's anything wrong with the 
hardware.

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-11 Thread Larry Finger
Andy,

Please try the patch below to see what we can learn from the DMA descriptor
errors. Some of this code is temporary, but there are also some statements that
will probably become permanent.

Please post any messages that result.

Larry


Index: wireless-testing/drivers/net/wireless/b43/dma.c
===
--- wireless-testing.orig/drivers/net/wireless/b43/dma.c
+++ wireless-testing/drivers/net/wireless/b43/dma.c
@@ -46,6 +46,8 @@
  * into separate slots. */
 #define TX_SLOTS_PER_FRAME 2

+int dma_point = 0;
+struct b43_dmadesc_generic dma_desc_save[20];

 /* 32bit DMA ops. */
 static
@@ -190,6 +192,12 @@ static void op64_fill_descriptor(struct
desc->dma64.control1 = cpu_to_le32(ctl1);
desc->dma64.address_low = cpu_to_le32(addrlo);
desc->dma64.address_high = cpu_to_le32(addrhi);
+   dma_desc_save[dma_point].dma64.control0 = desc->dma64.control0;
+   dma_desc_save[dma_point].dma64.control1 = desc->dma64.control1;
+   dma_desc_save[dma_point].dma64.address_low = desc->dma64.address_low;
+   dma_desc_save[dma_point].dma64.address_high = desc->dma64.address_high;
+   if (++dma_point >= 20)
+   dma_point = 0;
 }

 static void op64_poke_tx(struct b43_dmaring *ring, int slot)
@@ -1216,8 +1224,11 @@ static int dma_tx_fragment(struct b43_dm
meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
/* create a bounce buffer in zone_dma on mapping failure. */
if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
+   printk(KERN_INFO "b43: Using bounce buffer\n");
bounce_skb = __dev_alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
if (!bounce_skb) {
+   b43warn(ring->dev->wl, "Bounce buffer allocation "
+   "failed\n");
ring->current_slot = old_top_slot;
ring->used_slots = old_used_slots;
err = -ENOMEM;
@@ -1236,6 +1247,8 @@ static int dma_tx_fragment(struct b43_dm
meta->skb = skb;
meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
if (b43_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
+   b43warn(ring->dev->wl, "DMA mapping error for bounce "
+   "buffer\n");
ring->current_slot = old_top_slot;
ring->used_slots = old_used_slots;
err = -EIO;
@@ -1620,6 +1633,25 @@ void b43_dma_tx_resume(struct b43_wldev
b43_power_saving_ctl_bits(dev, 0);
 }

+void b43_dump_desc_buffer(void)
+{
+   /* dump the descriptor buffer once */
+   int i, j = dma_point;
+   static int once = 0;
+
+   if (once)
+   return;
+   printk(KERN_INFO "b43: Dump of last 20 DMA descriptors\n");
+   for (i = 0; i < 20; i++) {
+   if (--j < 0)
+   j = 19;
+   printk(KERN_INFO "b43: Descr. %2d: 0x%x 0x%X 0x%X 0x%X\n", i,
+  dma_desc_save[j].dma64.control0, 
dma_desc_save[j].dma64.control1,
+  dma_desc_save[j].dma64.address_low, 
dma_desc_save[j].dma64.address_high);
+   }
+   once++;
+}
+
 #ifdef CONFIG_B43_PIO
 static void direct_fifo_rx(struct b43_wldev *dev, enum b43_dmatype type,
   u16 mmio_base, bool enable)
Index: wireless-testing/drivers/net/wireless/b43/dma.h
===
--- wireless-testing.orig/drivers/net/wireless/b43/dma.h
+++ wireless-testing/drivers/net/wireless/b43/dma.h
@@ -287,4 +287,5 @@ void b43_dma_rx(struct b43_dmaring *ring
 void b43_dma_direct_fifo_rx(struct b43_wldev *dev,
unsigned int engine_index, bool enable);

+void b43_dump_desc_buffer(void);
 #endif /* B43_DMA_H_ */
Index: wireless-testing/drivers/net/wireless/b43/main.c
===
--- wireless-testing.orig/drivers/net/wireless/b43/main.c
+++ wireless-testing/drivers/net/wireless/b43/main.c
@@ -1785,6 +1785,7 @@ static void b43_do_interrupt_thread(stru
   dma_reason[2], dma_reason[3],
   dma_reason[4], dma_reason[5]);
b43_controller_restart(dev, "DMA error");
+   b43_dump_desc_buffer();
return;
}
if (merged_dma_reason & B43_DMAIRQ_NONFATALMASK) {
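
The bookkeeping in the patch above amounts to a fixed-size circular history
that is later read back newest-first. A minimal user-space sketch of that
pattern follows; it assumes nothing about the driver beyond what the patch
shows, and saved_desc, desc_history, history_record and history_get are
illustrative names, not b43 symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HISTORY_LEN 20

/* Copy of the four 32-bit words of one descriptor. */
struct saved_desc {
	uint32_t control0;
	uint32_t control1;
	uint32_t address_low;
	uint32_t address_high;
};

struct desc_history {
	struct saved_desc slots[HISTORY_LEN];
	int next;	/* slot that the next record will overwrite */
};

/* Store one descriptor, wrapping around after HISTORY_LEN entries. */
static void history_record(struct desc_history *h, const struct saved_desc *d)
{
	assert(h != NULL && d != NULL);
	h->slots[h->next] = *d;
	if (++h->next >= HISTORY_LEN)
		h->next = 0;
}

/* Return the entry written `age` records ago (0 is the most recent). */
static const struct saved_desc *history_get(const struct desc_history *h,
					     int age)
{
	assert(h != NULL && age >= 0 && age < HISTORY_LEN);
	int idx = h->next - 1 - age;

	if (idx < 0)
		idx += HISTORY_LEN;
	return &h->slots[idx];
}
```

Because the dump in the patch walks backwards from the write index, entry 0
in the log is the descriptor filled most recently, so a lower-numbered line
is newer than a higher-numbered one. The history in the patch is a single
global array filled from op64_fill_descriptor, so as far as the patch shows,
descriptors from every 64-bit ring share the same 20 slots.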



Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-11 Thread Michael Buesch
On Wednesday 11 November 2009 20:12:59 Larry Finger wrote:
> On 11/11/2009 08:52 AM, Andrew Benton wrote:
> > I recompiled my kernel today (current wireless-testing) and disabled ACPI
> > entirely. It worked fine for 10 minutes, and then the internet connection
> > died and left this in /var/log/sys.log:
> > 
> > Nov 11 14:31:31 doughnut ntpd[398]: kernel time sync status change 2001
> > Nov 11 14:36:57 doughnut ntpd[398]: synchronized to 130.88.200.4, stratum 2
> > Nov 11 14:37:31 doughnut kernel: ------------[ cut here ]------------
> > Nov 11 14:37:31 doughnut kernel: kernel BUG at mm/slub.c:2969!
> > Nov 11 14:37:31 doughnut kernel: invalid opcode: 0000 [#1] SMP
> > Nov 11 14:37:31 doughnut kernel: last sysfs file: 
> > /sys/devices/pci0000:00/0000:00:02.1/resource
> > Nov 11 14:37:31 doughnut kernel: Modules linked in:
> > Nov 11 14:37:31 doughnut kernel:
> > Nov 11 14:37:31 doughnut kernel: Pid: 343, comm: irq/17-b43 Not tainted 
> > (2.6.32-rc6-wl #1) Inspiron 910
> > Nov 11 14:37:31 doughnut kernel: EIP: 0060:[] EFLAGS: 00010246 
> > CPU: 0
> > Nov 11 14:37:31 doughnut kernel: EIP is at kfree+0xa9/0xb0
> > Nov 11 14:37:31 doughnut kernel: EAX: dededede EBX: f68f8200 ECX: 4000 
> > EDX: c19b9da0
> > Nov 11 14:37:31 doughnut kernel: ESI: ef00 EDI: 0400 EBP: f72c5400 
> > ESP: f6a3ded0
> > Nov 11 14:37:31 doughnut kernel:  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 
> > 0068
> > Nov 11 14:37:31 doughnut kernel: Process irq/17-b43 (pid: 343, ti=f6a3c000 
> > task=f73fa380 task.ti=f6a3c000)
> > Nov 11 14:37:31 doughnut kernel: Stack:
> > Nov 11 14:37:31 doughnut kernel:  000e7ef0 c1021c31 f68f8200 ef00 
> > 0400 c12d47ce c13ee7c0 f73fa380
> > Nov 11 14:37:31 doughnut kernel: <0> 7fff7fff dededede  c141c934 
> > f7093458 f6a3df64 f73b7000 f72c5400
> > Nov 11 14:37:31 doughnut kernel: <0> f72c5400 f6a3df64  c12d0556 
> >  c12c0b77 0046 0046
> > Nov 11 14:37:31 doughnut kernel: Call Trace:
> > Nov 11 14:37:31 doughnut kernel:  [] ? update_curr_rt+0x251/0x2c0
> > Nov 11 14:37:31 doughnut kernel:  [] ? 
> > b43_dma_handle_txstatus+0xbe/0x270
> > Nov 11 14:37:31 doughnut kernel:  [] ? 
> > b43_handle_txstatus+0x36/0x60
> > Nov 11 14:37:31 doughnut kernel:  [] ? 
> > b43_do_interrupt_thread+0x1d7/0x5d0
> > Nov 11 14:37:31 doughnut kernel:  [] ? 
> > b43_interrupt_thread_handler+0x15/0x30
> > Nov 11 14:37:31 doughnut kernel:  [] ? irq_thread+0x104/0x1d0
> > Nov 11 14:37:31 doughnut kernel:  [] ? complete+0x40/0x60
> > Nov 11 14:37:31 doughnut kernel:  [] ? irq_thread+0x0/0x1d0
> > Nov 11 14:37:31 doughnut kernel:  [] ? kthread+0x74/0x80
> > Nov 11 14:37:31 doughnut kernel:  [] ? kthread+0x0/0x80
> > Nov 11 14:37:31 doughnut kernel:  [] ? 
> > kernel_thread_helper+0x7/0x18
> > Nov 11 14:37:31 doughnut kernel: Code: e8 1d fc ff ff eb d9 66 f7 c1 00 c0 
> > 74 1d 8b 5c 24 08 89 d0 8b 74 24 0c 8b 7c 24 10 83 c4 14 e9 8e 24 fe ff 8b 
> > 52 0c 8b 0a eb 84 <0f> 0b eb fe 8d 76 00 83 e8 60 e9 48 ff ff ff 90 8d b4 
> > 26 00 00
> > Nov 11 14:37:31 doughnut kernel: EIP: [] kfree+0xa9/0xb0 SS:ESP 
> > 0068:f6a3ded0
> > Nov 11 14:37:31 doughnut kernel: ---[ end trace 021257f2296ca88f ]---
> > Nov 11 14:37:31 doughnut kernel: exiting task "irq/17-b43" (343) is an 
> > active IRQ thread (irq 17)
> > 
> > Hopefully this means something.
> > It makes a change from the "b43-phy0 ERROR: Fatal DMA error"
> 
> Yes it does.
> 
> Such an error in SLUB handling could arise from a DMA problem in b43, but
> it could also arise from a memory error. Please run memtest86+ for an extended
> period so that a hardware error can be ruled out. A 24-hour run would be good.
> If that is not possible, please do at least a 12-hour test.


Do we have a SLUB expert who can tell us what this BUG() actually means?
I can't make any sense of it, because I don't know the meaning of those page
flags.


-- 
Greetings, Michael.


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-11 Thread Larry Finger
On 11/11/2009 08:52 AM, Andrew Benton wrote:
> I recompiled my kernel today (current wireless-testing) and disabled ACPI
> entirely. It worked fine for 10 minutes, and then the internet connection
> died and left this in /var/log/sys.log:
> 
> Nov 11 14:31:31 doughnut ntpd[398]: kernel time sync status change 2001
> Nov 11 14:36:57 doughnut ntpd[398]: synchronized to 130.88.200.4, stratum 2
> Nov 11 14:37:31 doughnut kernel: ------------[ cut here ]------------
> Nov 11 14:37:31 doughnut kernel: kernel BUG at mm/slub.c:2969!
> Nov 11 14:37:31 doughnut kernel: invalid opcode: 0000 [#1] SMP
> Nov 11 14:37:31 doughnut kernel: last sysfs file: 
> /sys/devices/pci0000:00/0000:00:02.1/resource
> Nov 11 14:37:31 doughnut kernel: Modules linked in:
> Nov 11 14:37:31 doughnut kernel:
> Nov 11 14:37:31 doughnut kernel: Pid: 343, comm: irq/17-b43 Not tainted 
> (2.6.32-rc6-wl #1) Inspiron 910
> Nov 11 14:37:31 doughnut kernel: EIP: 0060:[] EFLAGS: 00010246 CPU: 0
> Nov 11 14:37:31 doughnut kernel: EIP is at kfree+0xa9/0xb0
> Nov 11 14:37:31 doughnut kernel: EAX: dededede EBX: f68f8200 ECX: 4000 
> EDX: c19b9da0
> Nov 11 14:37:31 doughnut kernel: ESI: ef00 EDI: 0400 EBP: f72c5400 
> ESP: f6a3ded0
> Nov 11 14:37:31 doughnut kernel:  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Nov 11 14:37:31 doughnut kernel: Process irq/17-b43 (pid: 343, ti=f6a3c000 
> task=f73fa380 task.ti=f6a3c000)
> Nov 11 14:37:31 doughnut kernel: Stack:
> Nov 11 14:37:31 doughnut kernel:  000e7ef0 c1021c31 f68f8200 ef00 
> 0400 c12d47ce c13ee7c0 f73fa380
> Nov 11 14:37:31 doughnut kernel: <0> 7fff7fff dededede  c141c934 
> f7093458 f6a3df64 f73b7000 f72c5400
> Nov 11 14:37:31 doughnut kernel: <0> f72c5400 f6a3df64  c12d0556 
>  c12c0b77 0046 0046
> Nov 11 14:37:31 doughnut kernel: Call Trace:
> Nov 11 14:37:31 doughnut kernel:  [] ? update_curr_rt+0x251/0x2c0
> Nov 11 14:37:31 doughnut kernel:  [] ? 
> b43_dma_handle_txstatus+0xbe/0x270
> Nov 11 14:37:31 doughnut kernel:  [] ? b43_handle_txstatus+0x36/0x60
> Nov 11 14:37:31 doughnut kernel:  [] ? 
> b43_do_interrupt_thread+0x1d7/0x5d0
> Nov 11 14:37:31 doughnut kernel:  [] ? 
> b43_interrupt_thread_handler+0x15/0x30
> Nov 11 14:37:31 doughnut kernel:  [] ? irq_thread+0x104/0x1d0
> Nov 11 14:37:31 doughnut kernel:  [] ? complete+0x40/0x60
> Nov 11 14:37:31 doughnut kernel:  [] ? irq_thread+0x0/0x1d0
> Nov 11 14:37:31 doughnut kernel:  [] ? kthread+0x74/0x80
> Nov 11 14:37:31 doughnut kernel:  [] ? kthread+0x0/0x80
> Nov 11 14:37:31 doughnut kernel:  [] ? kernel_thread_helper+0x7/0x18
> Nov 11 14:37:31 doughnut kernel: Code: e8 1d fc ff ff eb d9 66 f7 c1 00 c0 74 
> 1d 8b 5c 24 08 89 d0 8b 74 24 0c 8b 7c 24 10 83 c4 14 e9 8e 24 fe ff 8b 52 0c 
> 8b 0a eb 84 <0f> 0b eb fe 8d 76 00 83 e8 60 e9 48 ff ff ff 90 8d b4 26 00 00
> Nov 11 14:37:31 doughnut kernel: EIP: [] kfree+0xa9/0xb0 SS:ESP 
> 0068:f6a3ded0
> Nov 11 14:37:31 doughnut kernel: ---[ end trace 021257f2296ca88f ]---
> Nov 11 14:37:31 doughnut kernel: exiting task "irq/17-b43" (343) is an active 
> IRQ thread (irq 17)
> 
> Hopefully this means something.
> It makes a change from the "b43-phy0 ERROR: Fatal DMA error"

Yes it does.

Such an error in SLUB handling could arise from a DMA problem in b43, but
it could also arise from a memory error. Please run memtest86+ for an extended
period so that a hardware error can be ruled out. A 24-hour run would be good.
If that is not possible, please do at least a 12-hour test.

Larry


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-11 Thread Andrew Benton
I recompiled my kernel today (current wireless-testing) and disabled ACPI
entirely. It worked fine for 10 minutes, and then the internet connection
died and left this in /var/log/sys.log:

Nov 11 14:31:31 doughnut ntpd[398]: kernel time sync status change 2001
Nov 11 14:36:57 doughnut ntpd[398]: synchronized to 130.88.200.4, stratum 2
Nov 11 14:37:31 doughnut kernel: ------------[ cut here ]------------
Nov 11 14:37:31 doughnut kernel: kernel BUG at mm/slub.c:2969!
Nov 11 14:37:31 doughnut kernel: invalid opcode: 0000 [#1] SMP
Nov 11 14:37:31 doughnut kernel: last sysfs file: 
/sys/devices/pci0000:00/0000:00:02.1/resource
Nov 11 14:37:31 doughnut kernel: Modules linked in:
Nov 11 14:37:31 doughnut kernel:
Nov 11 14:37:31 doughnut kernel: Pid: 343, comm: irq/17-b43 Not tainted 
(2.6.32-rc6-wl #1) Inspiron 910
Nov 11 14:37:31 doughnut kernel: EIP: 0060:[] EFLAGS: 00010246 CPU: 0
Nov 11 14:37:31 doughnut kernel: EIP is at kfree+0xa9/0xb0
Nov 11 14:37:31 doughnut kernel: EAX: dededede EBX: f68f8200 ECX: 4000 EDX: 
c19b9da0
Nov 11 14:37:31 doughnut kernel: ESI: ef00 EDI: 0400 EBP: f72c5400 ESP: 
f6a3ded0
Nov 11 14:37:31 doughnut kernel:  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Nov 11 14:37:31 doughnut kernel: Process irq/17-b43 (pid: 343, ti=f6a3c000 
task=f73fa380 task.ti=f6a3c000)
Nov 11 14:37:31 doughnut kernel: Stack:
Nov 11 14:37:31 doughnut kernel:  000e7ef0 c1021c31 f68f8200 ef00 0400 
c12d47ce c13ee7c0 f73fa380
Nov 11 14:37:31 doughnut kernel: <0> 7fff7fff dededede  c141c934 
f7093458 f6a3df64 f73b7000 f72c5400
Nov 11 14:37:31 doughnut kernel: <0> f72c5400 f6a3df64  c12d0556 
 c12c0b77 0046 0046
Nov 11 14:37:31 doughnut kernel: Call Trace:
Nov 11 14:37:31 doughnut kernel:  [] ? update_curr_rt+0x251/0x2c0
Nov 11 14:37:31 doughnut kernel:  [] ? 
b43_dma_handle_txstatus+0xbe/0x270
Nov 11 14:37:31 doughnut kernel:  [] ? b43_handle_txstatus+0x36/0x60
Nov 11 14:37:31 doughnut kernel:  [] ? 
b43_do_interrupt_thread+0x1d7/0x5d0
Nov 11 14:37:31 doughnut kernel:  [] ? 
b43_interrupt_thread_handler+0x15/0x30
Nov 11 14:37:31 doughnut kernel:  [] ? irq_thread+0x104/0x1d0
Nov 11 14:37:31 doughnut kernel:  [] ? complete+0x40/0x60
Nov 11 14:37:31 doughnut kernel:  [] ? irq_thread+0x0/0x1d0
Nov 11 14:37:31 doughnut kernel:  [] ? kthread+0x74/0x80
Nov 11 14:37:31 doughnut kernel:  [] ? kthread+0x0/0x80
Nov 11 14:37:31 doughnut kernel:  [] ? kernel_thread_helper+0x7/0x18
Nov 11 14:37:31 doughnut kernel: Code: e8 1d fc ff ff eb d9 66 f7 c1 00 c0 74 
1d 8b 5c 24 08 89 d0 8b 74 24 0c 8b 7c 24 10 83 c4 14 e9 8e 24 fe ff 8b 52 0c 
8b 0a eb 84 <0f> 0b eb fe 8d 76 00 83 e8 60 e9 48 ff ff ff 90 8d b4 26 00 00
Nov 11 14:37:31 doughnut kernel: EIP: [] kfree+0xa9/0xb0 SS:ESP 
0068:f6a3ded0
Nov 11 14:37:31 doughnut kernel: ---[ end trace 021257f2296ca88f ]---
Nov 11 14:37:31 doughnut kernel: exiting task "irq/17-b43" (343) is an active 
IRQ thread (irq 17)

Hopefully this means something.
It makes a change from the "b43-phy0 ERROR: Fatal DMA error"

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-11 Thread Andrew Benton
On 11/11/09 02:58, Larry Finger wrote:
> On 11/10/2009 06:48 PM, Andrew Benton wrote:
>> Nov 11 00:04:37 doughnut kernel: b43-phy0 ERROR: Fatal DMA error:
>> 0x00000400, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
>> Nov 11 00:04:37 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
>
> The 0x0800 indicates a descriptor problem. Why it should happen after 15
> minutes is perplexing. If I write a diagnostic patch, could you test it?
>
> Larry
>

Sure, I'd be glad to be of use.

Andy


Re: b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-10 Thread Larry Finger
On 11/10/2009 06:48 PM, Andrew Benton wrote:
> Hello world,
> I have a Dell Inspiron 910 with a Broadcom wireless card.
> 
> andy:~$ lspci -vnn | grep 14e4
> 03:00.0 Network controller [0280]: Broadcom Corporation BCM4312 
> 802.11b/g [14e4:4315] (rev 01)
>   Subsystem: Broadcom Corporation Device [14e4:04b5]
> 
> Today I downloaded the current wireless-testing kernel (2.6.32-rc6-wl) 
> and compiled it without the ACPI kernel driver.
> It worked well for about 15 mins but then the network connection was 
> lost and /var/log/sys.log started to fill up with
> 
> Nov 11 00:04:31 doughnut kernel: b43-phy0 ERROR: Fatal DMA error:
> 0x00000800, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
> Nov 11 00:04:31 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
> Nov 11 00:04:31 doughnut kernel: b43-phy0 ERROR: Fatal DMA error:
> 0x00000800, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
> Nov 11 00:04:31 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
> Nov 11 00:04:31 doughnut kernel: b43-phy0: Loading firmware version
> 410.2160 (2007-05-26 15:32:10)
> Nov 11 00:04:37 doughnut kernel: b43-phy0: Controller restarted
> Nov 11 00:04:37 doughnut kernel: b43-phy0 ERROR: Fatal DMA error:
> 0x00000400, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
> Nov 11 00:04:37 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...

The 0x0800 indicates a descriptor problem. Why it should happen after 15
minutes is perplexing. If I write a diagnostic patch, could you test it?

Larry
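
For readers following the error codes: the six values in each log line are per-engine DMA reason words, and the interrupt thread ORs them together before testing them against a fatal mask. The sketch below shows that check in user-space C. The mask values are an assumption modelled on the B43_DMAIRQ_FATALMASK and B43_DMAIRQ_NONFATALMASK definitions in the driver's dma.h of this period and should be verified against the tree in use. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout; check against drivers/net/wireless/b43/dma.h. */
#define DMAIRQ_FATALMASK	((1u << 10) | (1u << 11) | (1u << 12) | \
				 (1u << 14) | (1u << 15))
#define DMAIRQ_NONFATALMASK	(1u << 13)

/* OR the per-engine reason words into one word. */
static uint32_t merge_dma_reasons(const uint32_t *reason, int n)
{
	uint32_t merged = 0;
	int i;

	for (i = 0; i < n; i++)
		merged |= reason[i];
	return merged;
}

/* True if any bit of the (assumed) fatal mask is set. */
static bool dma_reason_is_fatal(uint32_t merged)
{
	return (merged & DMAIRQ_FATALMASK) != 0;
}

/* Number of the lowest fatal bit that is set, or -1 if none is. */
static int first_fatal_bit(uint32_t merged)
{
	uint32_t fatal = merged & DMAIRQ_FATALMASK;
	int bit;

	for (bit = 0; bit < 32; bit++) {
		if (fatal & (1u << bit))
			return bit;
	}
	return -1;
}
```

With these assumed masks, both values seen in the log are fatal: 0x800 is bit 11 and 0x400 is bit 10. Only engine 0 reports an error in either case, since the other five words are zero.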


b43-phy0 ERROR: Fatal DMA error: 0x00000400

2009-11-10 Thread Andrew Benton
Hello world,
I have a Dell Inspiron 910 with a Broadcom wireless card.

andy:~$ lspci -vnn | grep 14e4
03:00.0 Network controller [0280]: Broadcom Corporation BCM4312 
802.11b/g [14e4:4315] (rev 01)
Subsystem: Broadcom Corporation Device [14e4:04b5]

Today I downloaded the current wireless-testing kernel (2.6.32-rc6-wl) 
and compiled it without the ACPI kernel driver.
It worked well for about 15 mins but then the network connection was 
lost and /var/log/sys.log started to fill up with

Nov 11 00:04:31 doughnut kernel: b43-phy0 ERROR: Fatal DMA error:
0x00000800, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 11 00:04:31 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 11 00:04:31 doughnut kernel: b43-phy0 ERROR: Fatal DMA error:
0x00000800, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 11 00:04:31 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...
Nov 11 00:04:31 doughnut kernel: b43-phy0: Loading firmware version
410.2160 (2007-05-26 15:32:10)
Nov 11 00:04:37 doughnut kernel: b43-phy0: Controller restarted
Nov 11 00:04:37 doughnut kernel: b43-phy0 ERROR: Fatal DMA error:
0x00000400, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000
Nov 11 00:04:37 doughnut kernel: b43-phy0: Controller RESET (DMA error) ...

...and so on.

Andy