Bob Copeland wrote:
> On Sun, Mar 14, 2010 at 08:43:02PM +0200, Maxim Levitsky wrote:
>   
>> One thing I noticed is racy behaviour of ath5k_txbuf_setup.
>>
>> ..
>>
>>      spin_lock_bh(&txq->lock);
>>      list_add_tail(&bf->list, &txq->q);
>>      if (txq->link == NULL) /* is this first packet? */
>>              ath5k_hw_set_txdp(ah, txq->qnum, bf->daddr);
>>      else /* no, so only link it */
>>              *txq->link = bf->daddr;
>>
>> ..
>>     
>
> As an aside (even though it probably isn't what is going on here
> given it's a PC) -- my understanding is that this kind of pattern
> needs a barrier because the write to the descriptor in host memory
> can be reordered with respect to the CR write in mmio space.
>
> You may try sticking an mmiowb() after we update the link and see
> if the crash goes away.
>
>   
This reminds me of a very similar bug in madwifi. The problem was a race 
condition between the host CPU and the radio card itself, since 
AR5K_INT_TXOK is triggered before the hardware reads ds_link. What 
can happen is that the software resets ds_link to 0 before the 
hardware actually reads it. When the hardware then reads it and finds 0, 
it concludes that the TX queue has reached its end and stops 
transmitting. This case is "easy" to diagnose, since TXDP is NULL.
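
To make the ordering problem concrete, here is a small user-space 
simulation of the two possible interleavings. It is only a sketch: the 
sim_* names and the struct are made up for illustration and are not the 
madwifi or ath5k code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor: only the link field matters here. */
struct sim_desc_link {
	uint32_t ds_link; /* bus address of the next descriptor, 0 = end */
};

/* "Hardware" side: follow the link to find the next descriptor. */
uint32_t sim_hw_fetch_next(const struct sim_desc_link *d)
{
	return d->ds_link;
}

/* "Software" side: recycle the descriptor after TXOK, clearing its link. */
void sim_sw_recycle(struct sim_desc_link *d)
{
	d->ds_link = 0;
}

/*
 * Run both steps in one of the two possible orders and return the
 * address the simulated hardware ends up with (its new TXDP).
 */
uint32_t sim_run(uint32_t next, bool sw_first)
{
	struct sim_desc_link d = { .ds_link = next };
	uint32_t txdp;

	if (sw_first) {
		sim_sw_recycle(&d);          /* software wins the race */
		txdp = sim_hw_fetch_next(&d); /* hardware reads 0 */
	} else {
		txdp = sim_hw_fetch_next(&d); /* hardware reads the real link */
		sim_sw_recycle(&d);
	}
	return txdp;
}
```

With sw_first set, the simulated hardware ends with a zero TXDP even 
though another descriptor was queued, which is the stall described above.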

One solution is to leave the TX descriptor in the TX queue for as long 
as TXDP still points to it.
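
A minimal sketch of that idea, again as a user-space model rather than 
driver code. The struct, the sim_reap() helper and the way TXDP is passed 
in are all assumptions made for the example; in a real driver the value 
would come from reading the TXDP register.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a TX descriptor; not the real ath5k struct. */
struct sim_desc {
	uint32_t daddr;   /* bus address of this descriptor */
	uint32_t ds_link; /* bus address of the next one, 0 = end of chain */
	bool done;        /* completion status reported by the hardware */
};

/*
 * Reap completed descriptors from the head of q[0..n).
 * hw_txdp is the value currently held in TXDP.
 * Returns how many descriptors were recycled.
 */
size_t sim_reap(struct sim_desc *q, size_t n, uint32_t hw_txdp)
{
	size_t reaped = 0;

	while (reaped < n && q[reaped].done) {
		/* Hardware may not have read ds_link yet: leave it queued. */
		if (q[reaped].daddr == hw_txdp)
			break;
		q[reaped].ds_link = 0; /* safe to reset now */
		reaped++;
	}
	return reaped;
}
```

The held descriptor is simply picked up on a later pass, once TXDP has 
moved on to the next descriptor.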

If I am correct, debugging such a problem is tricky because it is highly 
timing-sensitive, so adding a printk can make it disappear. Here is the 
code I used for debugging:
http://madwifi-project.org/browser/madwifi/trunk/ath/if_ath.c#L3040

If you can confirm that TXDP is indeed NULL while the TX queue is not 
empty, then we can confirm it is the same bug.
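
The condition to look for can be written as a small predicate. This is 
only an illustration: the snapshot struct and its fields are hypothetical, 
and how the two values are sampled in the driver is left open.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one TX queue, taken at a single point in time. */
struct sim_txq_state {
	uint32_t txdp;  /* value read back from TXDP */
	size_t pending; /* buffers still sitting on the software queue */
};

/* True when the hardware pointer is NULL although work is still queued. */
bool sim_txq_stalled(const struct sim_txq_state *s)
{
	return s->txdp == 0 && s->pending > 0;
}
```

Storing the result in a counter instead of printing it right away should 
disturb the timing less than a printk in the hot path.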

Regards,
Benoit
