Re: [ath9k-devel] [Cerowrt-devel] periodic hang of ath9k

2014-07-14 Thread Felix Fietkau
On 2014-07-14 06:25, Sujith Manoharan wrote:
> Dave Taht wrote:
>> cc-ing ath9k-devel for this update on http://www.bufferbloat.net/issues/442
>>
>> this bug, which some people (usually on macs with low signal strength)
>> can get to occur fairly rapidly, but I can't, is driving me 9 kinds of
>> crazy...
>
> Does stock OpenWrt also have this bug, or is this specific to Cerowrt ?

After receiving some useful debug output from Antonio Quartulli (who
was able to reproduce it easily), I believe that I have tracked down
this issue to some bugs in counting pending tx frames. When frames get
pushed through the U-APSD queue for PS responses and dropped there due
to retransmissions, the counter probably does not get decremented
properly.

I've come up with an untested patch that should fix this codepath
and make it easier to verify.

If you're affected by the bug, please test this patch:
---
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -185,7 +185,8 @@ struct ath_atx_ac {
 
 struct ath_frame_info {
 	struct ath_buf *bf;
-	int framelen;
+	u16 framelen;
+	s8 txq;
 	enum ath9k_key_type keytype;
 	u8 keyix;
 	u8 rtscts_rate;
--- a/drivers/net/wireless/ath/ath9k/xmit.c
+++ b/drivers/net/wireless/ath/ath9k/xmit.c
@@ -147,15 +147,13 @@ static void ath_set_rates(struct ieee802
 static void ath_txq_skb_done(struct ath_softc *sc, struct ath_txq *txq,
 			     struct sk_buff *skb)
 {
-	int q;
-
-	q = skb_get_queue_mapping(skb);
-	if (txq == sc->tx.uapsdq)
-		txq = sc->tx.txq_map[q];
+	struct ath_frame_info *fi = get_frame_info(skb);
+	int q = fi->txq;
 
-	if (txq != sc->tx.txq_map[q])
+	if (q < 0)
 		return;
 
+	txq = sc->tx.txq_map[q];
 	if (WARN_ON(--txq->pending_frames < 0))
 		txq->pending_frames = 0;
 
@@ -1999,6 +1997,7 @@ static void setup_frame_info(struct ieee
 		an = (struct ath_node *) sta->drv_priv;
 
 	memset(fi, 0, sizeof(*fi));
+	fi->txq = -1;
 	if (hw_key)
 		fi->keyix = hw_key->hw_key_idx;
 	else if (an && ieee80211_is_data(hdr->frame_control) && an->ps_key > 0)
@@ -2150,6 +2149,7 @@ int ath_tx_start(struct ieee80211_hw *hw
 	struct ieee80211_tx_info *info = IEEE80211_SKB_CB(skb);
 	struct ieee80211_sta *sta = txctl->sta;
 	struct ieee80211_vif *vif = info->control.vif;
+	struct ath_frame_info *fi = get_frame_info(skb);
 	struct ath_softc *sc = hw->priv;
 	struct ath_txq *txq = txctl->txq;
 	struct ath_atx_tid *tid = NULL;
@@ -2170,11 +2170,13 @@ int ath_tx_start(struct ieee80211_hw *hw
 	q = skb_get_queue_mapping(skb);
 
 	ath_txq_lock(sc, txq);
-	if (txq == sc->tx.txq_map[q] &&
-	    ++txq->pending_frames > sc->tx.txq_max_pending[q] &&
-	    !txq->stopped) {
-		ieee80211_stop_queue(sc->hw, q);
-		txq->stopped = true;
+	if (txq == sc->tx.txq_map[q]) {
+		fi->txq = q;
+		if (++txq->pending_frames > sc->tx.txq_max_pending[q] &&
+		    !txq->stopped) {
+			ieee80211_stop_queue(sc->hw, q);
+			txq->stopped = true;
+		}
 	}
 
 	if (txctl->an && ieee80211_is_data_present(hdr->frame_control))
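
The accounting idea in the patch above can be modeled in user space. This is only a sketch under stated assumptions: `model_txq`, `model_frame`, `frame_init`, `frame_enqueue` and `frame_done` are hypothetical stand-ins, not ath9k types or functions. Each frame remembers which queue counter it was charged to (or -1 for none), so the decrement on completion no longer depends on which queue the frame was completed on.

```c
#include <stdio.h>

#define MODEL_NUM_QUEUES 4	/* hypothetical queue count for this model */

/* Hypothetical stand-in for a per-queue counter (not struct ath_txq). */
struct model_txq {
	int pending_frames;
};

/* Hypothetical stand-in for per-frame info (not struct ath_frame_info). */
struct model_frame {
	signed char txq;	/* queue charged at enqueue time, -1 if none */
};

static struct model_txq txq_map[MODEL_NUM_QUEUES];

/* Mirrors the "fi->txq = -1" initialisation: nothing charged yet. */
static void frame_init(struct model_frame *f)
{
	f->txq = -1;
}

/*
 * Charge the frame to queue q only when it is counted against that
 * queue, and record the queue index in the frame itself.
 */
static void frame_enqueue(struct model_frame *f, int q, int counted)
{
	if (!counted)
		return;
	f->txq = (signed char)q;
	txq_map[q].pending_frames++;
}

/*
 * Completion or drop: decrement exactly the counter that was
 * incremented, regardless of the path the frame took afterwards.
 */
static void frame_done(struct model_frame *f)
{
	int q = f->txq;

	if (q < 0)
		return;		/* never counted, nothing to undo */
	if (--txq_map[q].pending_frames < 0)
		txq_map[q].pending_frames = 0;	/* clamp, as the driver does */
}
```

A frame charged to queue 2 and later completed through some other path still brings `txq_map[2].pending_frames` back to its previous value, and a frame that was never counted leaves every counter untouched.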
___
ath9k-devel mailing list
ath9k-devel@lists.ath9k.org
https://lists.ath9k.org/mailman/listinfo/ath9k-devel


Re: [ath9k-devel] [Cerowrt-devel] periodic hang of ath9k

2014-07-14 Thread Stephen Hemminger
I think the stock Netgear firmware has similar issues; 2G wireless is
flaky when Macs are using it.



Re: [ath9k-devel] [Cerowrt-devel] periodic hang of ath9k

2014-07-14 Thread Dave Taht
I have little doubt that we have been coping with more than one bug.



On Mon, Jul 14, 2014 at 4:02 PM, R. <red...@gmail.com> wrote:
> Hello David & list,
>
> Do you think we could get a build with CONFIG_PACKAGE_ATH_DEBUG
> enabled? I've been following OpenWRT ticket #15320 and it looks like
> this would be the way to go in order to get field logs of WiFi issues
> from end-users.
>
> In the meantime, here's the latest WiFi failures on my end (running
> CeroWRT 3.10.44-6) -- recovered within two minutes:
>
> [14378.023437] ath: phy0: Failed to stop TX DMA, queues=0x004!
> [15130.140625] ath: phy0: Failed to stop TX DMA, queues=0x004!
> [15349.164062] ath: phy0: Failed to stop TX DMA, queues=0x004!
> [15349.179687] ath: phy0: DMA failed to stop in 10 ms AR_CR=0x0024 AR_DIAG_SW=0x4220 DMADBG_7=0x84c0
> [15349.191406] ath: phy0: Could not stop RX, we could be confusing the DMA engine when we start RX up
> [16886.886718] ath: phy0: Failed to stop TX DMA, queues=0x004!
> [19839.468750] ath: phy0: Failed to stop TX DMA, queues=0x005!
> [20286.019531] ath: phy0: Failed to stop TX DMA, queues=0x004!
> [20825.996093] ath: phy0: Failed to stop TX DMA, queues=0x005!
> [48749.316406] ath: phy0: Failed to stop TX DMA, queues=0x004!
> [48749.433593] ath: phy0: Failed to stop TX DMA, queues=0x004!
> -
> root@cerowrt:~# cat /sys/kernel/debug/ieee80211/phy0/ath9k/reset
>      Baseband Hang:  0
>  Baseband Watchdog:  0
>     Fatal HW Error:  0
>        TX HW error:  0
>   Transmit timeout:  0
>       TX Path Hang:  1
>        PLL RX Hang:  0
>           MAC Hang: 13
>       Stuck Beacon:  7
>          MCI Reset:  0
>  Calibration error:  1
> -
> Reply from 74.125.225.112: bytes=32 time=39ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=114ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=42ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=37ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=40ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=37ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=1323ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=37ms TTL=50
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
> Request timed out.
> Reply from 74.125.225.112: bytes=32 time=258ms TTL=50
> Request timed out.
> Reply from 74.125.225.112: bytes=32 time=1043ms TTL=50
> Request timed out.
> Reply from 74.125.225.112: bytes=32 time=206ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=91ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=38ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=38ms TTL=50
> Reply from 74.125.225.112: bytes=32 time=40ms TTL=50
>
> On Mon, Jul 14, 2014 at 2:51 PM, Stephen Hemminger
> <step...@networkplumber.org> wrote:
>> I think the stock netgear firmware has similar issues, 2G wireless is
>> flaky when the Mac's are using it.
>>
>> ___
>> Cerowrt-devel mailing list
>> cerowrt-de...@lists.bufferbloat.net
>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
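
For anyone reading the quoted log, the `queues=` values in the "Failed to stop TX DMA" lines look like a bitmask with one bit per TX queue index. That reading is an assumption about the message format, and the helper names and the queue count of 10 below are hypothetical. Under that assumption, a small sketch for listing which queue indices were still busy:

```c
#include <stdio.h>

/* Assumed upper bound on TX queue indices; hypothetical for this sketch. */
#define MODEL_NUM_TX_QUEUES 10

/*
 * Collect the indices of the bits set in mask into out[] (at most max
 * entries) and return how many were stored.
 */
static int pending_queues(unsigned int mask, int out[], int max)
{
	int n = 0;

	for (int i = 0; i < MODEL_NUM_TX_QUEUES && n < max; i++)
		if (mask & (1u << i))
			out[n++] = i;
	return n;
}

/* Print one mask in the log's format followed by the decoded indices. */
static void print_pending(unsigned int mask)
{
	int idx[MODEL_NUM_TX_QUEUES];
	int n = pending_queues(mask, idx, MODEL_NUM_TX_QUEUES);

	printf("queues=0x%03x:", mask);
	for (int i = 0; i < n; i++)
		printf(" %d", idx[i]);
	printf("\n");
}
```

With this reading, `0x004` decodes to queue index 2 alone and `0x005` to indices 0 and 2, so the same queue shows up in every failure in the log.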



-- 
Dave Täht

NSFW: 
https://w2.eff.org/Censorship/Internet_censorship_bills/russell_0296_indecent.article


Re: [ath9k-devel] [Cerowrt-devel] periodic hang of ath9k

2014-07-13 Thread Sujith Manoharan
Dave Taht wrote:
> cc-ing ath9k-devel for this update on http://www.bufferbloat.net/issues/442
>
> this bug, which some people (usually on macs with low signal strength)
> can get to occur fairly rapidly, but I can't, is driving me 9 kinds of
> crazy...

Does stock OpenWrt also have this bug, or is it specific to CeroWrt?

Sujith