Re: More data on open-source firmware crash

2009-02-23 Thread Michael Buesch
On Monday 23 February 2009 07:30:13 Larry Finger wrote:
 Francesco Gringoli wrote:
  
  do you mind testing this firmware? It's not the solution, but it can help
  us understand whether we should go this way. Download at
  http://www.ing.unibs.it/~gringoli/fwtest.tar.gz
  
  Before using this firmware please recompile b43 changing these two
  definitions in b43.h
  
  #define B43_MARKER_ID_REG   52
  #define B43_MARKER_LINE_REG 53
  
  I coded the firmware so that it will raise a B43_DEBUGIRQ_MARKER with id
  10, line 100 if the condition I have in mind is true. You will see it (I
  hope) in dmesg.
 
 I ran the test firmware until there was a failure. The B43_DEBUGIRQ_MARKER was
 not present in the dmesg output.
 
 I also did as Michael suggested and preserved the depth of the tx_status queue,
 as well as the maximum depth observed. The latter value is 16; therefore, it is
 possible that the queue is being overrun. It is certainly full if the limit
 really is 16. I still need to look at the detailed data from the last run, but
 it is late here today.

Note again that I expect the check for a full TX status queue to happen at the
_start_ of the TX processing in firmware. If we start TX and only later figure
out that the status queue is full, we can't really do anything about it. So
we'd have to check that before even fetching the frame from the DMA.
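
To illustrate the ordering described above, here is a minimal sketch in plain C rather than firmware code. All names (txstatus_queue, dma_ring, try_start_tx and so on) are hypothetical and appear neither in b43 nor in the firmware, and the limit of 16 is only the value discussed in this thread, not a confirmed hardware limit. The point is simply that the space check comes before the frame leaves the DMA ring, so a full status queue defers the frame instead of losing its status.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed limit; the thread only says "if the limit really is 16". */
#define TXSTATUS_QUEUE_LIMIT 16

/* Hypothetical model of the TX status queue fill level. */
struct txstatus_queue {
    unsigned int depth;   /* status entries not yet read by the driver */
};

/* Hypothetical model of the frames waiting in the DMA ring. */
struct dma_ring {
    unsigned int pending;
};

static bool txstatus_queue_full(const struct txstatus_queue *q)
{
    assert(q->depth <= TXSTATUS_QUEUE_LIMIT);
    return q->depth == TXSTATUS_QUEUE_LIMIT;
}

/*
 * Try to start one transmission. Returns true if a frame was fetched
 * and sent, false if TX was deferred or there was nothing to send.
 */
static bool try_start_tx(struct txstatus_queue *q, struct dma_ring *ring)
{
    /* Check for status-queue space first, before touching the DMA. */
    if (txstatus_queue_full(q))
        return false;          /* frame stays in the ring, nothing is lost */
    if (ring->pending == 0)
        return false;
    ring->pending--;           /* fetch the frame from the DMA */
    q->depth++;                /* its TX status now occupies one slot */
    return true;
}

/* The driver reading one status entry frees a slot. */
static void driver_read_txstatus(struct txstatus_queue *q)
{
    if (q->depth > 0)
        q->depth--;
}
```

With 20 pending frames and an empty status queue, this model sends 16 frames and then defers the remaining 4 until driver_read_txstatus() frees a slot.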

-- 
Greetings, Michael.
___
Bcm43xx-dev mailing list
Bcm43xx-dev@lists.berlios.de
https://lists.berlios.de/mailman/listinfo/bcm43xx-dev


Re: More data on open-source firmware crash

2009-02-22 Thread Francesco Gringoli

On Feb 22, 2009, at 8:10 PM, Larry Finger wrote:

 Francesco and Lorenzo,

 I modified my driver source to dump the firmware machine state whenever the
 b43_dma_handle_txstatus routine was called with an out-of-order cookie. With
 proprietary firmware, the test of a flood ping in one job and repeated tcpperf
 transmissions in a second ran for 10 hours without a single failure. With the
 open-source firmware it failed after about 2 hours.

 Below are the saved status data. Listed for each item are the cookie, the
 sequence number, and the skb length. The 0x84 length values come from the ping.
 All of the out-of-order items come from tcpperf - is it significant that they
 are from the longer set? Note that a number of cookie/sequence pairs are
 missing, namely: 2064/9C1, 2066/9C2, 2068/9C3, 206A/9C4, 206C/9C5, 2072/9C7,
 2076/9C9, and 207A/9CB. Cookie 206E is missing, but the next sequence (9C6) was
 attached to cookie 2070.


Larry,

do you mind testing this firmware? It's not the solution, but it can help
us understand whether we should go this way. Download at
http://www.ing.unibs.it/~gringoli/fwtest.tar.gz

Before using this firmware, please recompile b43 after changing these two
definitions in b43.h:

#define B43_MARKER_ID_REG   52
#define B43_MARKER_LINE_REG 53

I coded the firmware so that it will raise a B43_DEBUGIRQ_MARKER with
id 10, line 100 if the condition I have in mind is true. You will see it
(I hope) in dmesg.

Thanks,
bye
-FG


Re: More data on open-source firmware crash

2009-02-22 Thread Larry Finger
Francesco Gringoli wrote:
 
 do you mind testing this firmware? It's not the solution, but it can help
 us understand whether we should go this way. Download at
 http://www.ing.unibs.it/~gringoli/fwtest.tar.gz
 
 Before using this firmware please recompile b43 changing these two
 definitions in b43.h
 
 #define B43_MARKER_ID_REG   52
 #define B43_MARKER_LINE_REG 53
 
 I coded the firmware so that it will raise a B43_DEBUGIRQ_MARKER with id
 10, line 100 if the condition I have in mind is true. You will see it (I
 hope) in dmesg.

I ran the test firmware until there was a failure. The B43_DEBUGIRQ_MARKER was
not present in the dmesg output.

I also did as Michael suggested and preserved the depth of the tx_status queue,
as well as the maximum depth observed. The latter value is 16; therefore, it is
possible that the queue is being overrun. It is certainly full if the limit
really is 16. I still need to look at the detailed data from the last run, but
it is late here today.
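
For readers following along, the kind of depth bookkeeping described above could look roughly like the sketch below. It is not the code from the modified driver: the struct and function names are hypothetical, and it assumes the depth is counted simply as entries added minus entries read.

```c
#include <assert.h>

/* Hypothetical bookkeeping; none of these names come from b43. */
struct depth_stats {
    unsigned int cur;  /* entries currently in the queue */
    unsigned int max;  /* largest value of cur seen so far */
};

/* Call when a TX status entry is added to the queue. */
static void depth_on_enqueue(struct depth_stats *s)
{
    s->cur++;
    if (s->cur > s->max)
        s->max = s->cur;
}

/* Call when the driver reads one TX status entry. */
static void depth_on_dequeue(struct depth_stats *s)
{
    assert(s->cur > 0);
    s->cur--;
}
```

A maximum that equals the suspected queue limit, as with the value 16 reported here, shows the queue was full at least once but does not by itself prove that an entry was lost.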

Larry
