On Friday 28 May 2010 10:58:10 Robert Brown wrote:
> It looks like the code I
> added has made the corruption problem much worse.

welcome to the wonderful world of ath5k "heisenbugs"! :)

actually we can be happy that you have something repeatable here. i'm still 
trying to reproduce the same problem. can you give us more information how you 
transfer the data? how do you detect the duplicates?

also i'm wondering: how can it happen that the repeated packet passes the 
upper layers unnoticed? 

> For now, however, here are some of the debugging messages generated by the
> resets I saw while transferring data.  Perhaps the presence of repeated hex
> values in the "rx queue" messages or the info in the "R" receive buffer
> dumps contains a clue about what's going on.

i think, the repeated" rx queue hex" prints should be o.k. since this is just 
the RXDP (current rx descriptor for the hardware to use), and it does not have 
to change if no packet is received. more comments below:

> (ath5k_reset:2955): resetting
> May 27 19:58:59 ugg kernel: [68460.373320] ath5k phy0:
> (ath5k_txq_cleanup:1666): beacon queue 0
> May 27 19:58:59 ugg kernel: [68460.373462] ath5k phy0:
> (ath5k_txq_cleanup:1675): txq [0] 312206e0, link (null)
> May 27 19:58:59 ugg kernel: [68460.373615] rx queue 31220200, link f12201c0
> May 27 19:58:59 ugg kernel: [68460.373631] R (f12201e0 312201e0)
> 31220200 32ac7020 00000000 00000a00 109d8065 109d8065 *
> May 27 19:58:59 ugg kernel: [68460.428534] rx queue 31220220, link f1220200
> May 27 19:58:59 ugg kernel: [68460.484518] rx queue 31220260, link f1220240
> May 27 19:58:59 ugg kernel: [68460.540524] rx queue 31220260, link f1220240
> May 27 19:58:59 ugg kernel: [68460.597526] rx queue 31220280, link f1220260
> May 27 19:58:59 ugg kernel: [68460.653502] rx queue 312202c0, link f12202a0
> May 27 19:58:59 ugg kernel: [68460.709503] rx queue 31220320, link f1220300
> 
> This is interesting.  Repeated packet in the middle of the reset.

very interesting! - there should be no packet reception during a reset!!!

also i am wondering why you have so many resets? (a review of the reset code 
is the next thing we should do... it gives us trouble in other places too...)

i looked at the debug output more. it prints descriptors which are still in 
the RX queue just at the beginning of the reset function.

[68301.828987] rx queue 31220120, link f12200e0
[68301.829002] R (f1220100 31220100) 31220120 2a343020 00000000 00000a00 
12bd8608 12bd8608 *

[68301.888512] rx queue 31220180, link f1220140
[68301.888540] R (f1220160 31220160) 31220180 370f5020 00000000 00000a00 
21bd8608 21bd8608 !

[68307.773523] rx queue 312202a0, link f1220260
[68307.773531] R (f1220280 31220280) 312202a0 370f5020 00000000 00000a00 
11cd8123 11cd8123 *

[68389.913677] rx queue 312201c0, link f1220180
[68389.913688] R (f12201a0 312201a0) 312201c0 32ac4020 00000000 00000a00 
224d8608 224d8608 *

[68389.972482] rx queue 312201e0, link f12201a0
[68389.972493] R (f12201c0 312201c0) 312201e0 32ac7020 00000000 00000a00 
22ad8075 22ad8075 *

[68390.308460] rx queue 312202c0, link f1220280
[68390.308476] R (f12202a0 312202a0) 312202c0 32a8f020 00000000 00000a00 
10ad805c 10ad805c *

[68460.373615] rx queue 31220200, link f12201c0
[68460.373631] R (f12201e0 312201e0) 31220200 32ac7020 00000000 00000a00 
109d8065 109d8065 *

the fields in the R... line are in this order:

ds,
(unsigned long long)bf->daddr,
ds->ds_link,
ds->ds_data,
rd->rx_ctl.rx_control_0, 
rd->rx_ctl.rx_control_1,
rd->u.rx_stat.rx_status_0,
rd->u.rx_stat.rx_status_0,
!done ? ' ' : (rs->rs_status == 0) ? '*' : '!');

i'm not sure if it's relevant or not, but if we look at the adresses in the 
last 5 lines, they are all somehow related...

none of these descriptors was a JUMBO b.t.w.

bruno
_______________________________________________
ath5k-devel mailing list
ath5k-devel@lists.ath5k.org
https://lists.ath5k.org/mailman/listinfo/ath5k-devel

Reply via email to