Re: [ath9k-devel] [RFC] ath9k: Detect and work-around tx-queue hang.
On 02/22/2013 04:55 AM, Ben Greear wrote: On 02/22/2013 04:38 AM, Felix Fietkau wrote: On 2013-02-22 1:25 PM, Sujith Manoharan wrote: Felix Fietkau wrote: Please also check if the station(s) that the frames are queued for are in powersave state for some reason. That would prevent the tx path from throwing them in the hw queue, yet they'd still take up pending-frame slots. I was planning on fixing this eventually by expiring frames that stay in the queue for too long, but haven't decided on the exact approach yet. PS is disabled for multi-VIF. What about off-channel PS due to scans, etc. Scan is always locked to the same channel in this setup (once a single station is associated). The stations stay associated while this problem happens (the high-priority queue seems to work just fine, which may be the reason they stay associated just fine.) In some cases, I see packets delivered around 30 seconds late... aside from PS and off-channel..any idea what could make a packet stick around that long in the tx queues? One of our customers on a 3.5.7+ kernel hit the problem without using any RF attenuator...just over-the-air communication to their AP. It happened on both 2.4 and 5Ghz bands. Seems rx signal is around 40 in their environment. It took them around 24 hours to hit the problem on average. Last we checked, we could fairly easily reproduce this in our lab using an attenuator and a certain setup, so if there is any debugging we could add to help narrow down what might be causing this, we can give that a try. For instance, is there any good way to know for certain if packets in the queue are in power-save or not? I know we at least attempt to disable power-save, but possibly it gets re-enabled somehow? Thanks, Ben -- Ben Greear gree...@candelatech.com Candela Technologies Inc http://www.candelatech.com ___ ath9k-devel mailing list ath9k-devel@lists.ath9k.org https://lists.ath9k.org/mailman/listinfo/ath9k-devel
Re: [ath9k-devel] Questions in 802.11 n timestamps traces for aggregated frames ?
Hi Felix, Adrian, /* This packet was aggregated but doesn't carry status info */ if ((info-flags IEEE80211_TX_CTL_AMPDU) !(info-flags IEEE80211_TX_STAT_AMPDU)) return; Any packet with IEEE80211_TX_CTL_AMPDU and not IEEE80211_TX_STAT_AMPDU does not carry valid rate/retry information. Did you filter your debug stuff accordingly? I thought* rate_control_tx_status()[hook to minstrel_rate_ht]* in * ieee80211_tx_status()* will return back to driver (using the condition you told above) without reaching the radiotap code, but I don' t know why it does not. I continued and printed the flag, and have found the following : I added another flag(first entry) in radiotap header to indicate 1 if IEEE80211_TX_CTL_AMPDU is set and IEEE80211_TX_STAT_AMPDU is not (the above condition) eg. [1,0,0,0,1] first is above cond, last flag is for aggregation. The driver code indicates, this will be true if the frame passes through ath_tx_rc_status() [which is only twice]. I had noticed that if I do this, I won't probably get the timestamp of rest of the frames given back to mac80211. device,tsf , tx_flags, retx_count,succ rate, attempted bitrates, seq no,(queue+airtime)delay,dumb, frame size 64:a7:69:53:0b:15 159787172 [0, 0, 0, 0, 1] 0 52.0 [] 586 1233 2 1462 64:a7:69:53:0b:15 159787172 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 587 2697 2 1462 64:a7:69:53:0b:15 159848530 [0, 0, 0, 0, 0] 0 52.0 [] 588 1436 2 1462 64:a7:69:53:0b:15 159848892 [0, 0, 0, 0, 0] 0 52.0 [] 589 1722 2 1462 64:a7:69:53:0b:15 159849272 [0, 0, 0, 0, 0] 0 52.0 [] 590 1010 2 1462 64:a7:69:53:0b:15 159849931 [0, 0, 0, 0, 0] 0 58.5 [] 591 1602 2 1462 64:a7:69:53:0b:15 159850511 [0, 0, 0, 0, 0] 0 52.0 [] 592 1366 2 1462 64:a7:69:53:0b:15 159850864 [0, 0, 0, 0, 0] 0 52.0 [] 593 1631 2 1462 64:a7:69:53:0b:15 159851253 *[0, 0, 0, 0, 1] 0 52.0 []* 594 2796 2 1462 64:a7:69:53:0b:15 159851253 *[1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]]* *595* 1016 2 1462 64:a7:69:53:0b:15 159851253 [*1*, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] *597* 2578 2 1462 64:a7:69:53:0b:15 159851253 [*1*, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] *598* 1623 2 1462 64:a7:69:53:0b:15 159851253 [*1*, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] *599* 2312 2 1462 - all invalid !? 64:a7:69:53:0b:15 159851253 [*1*, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]]* 600* 182 2 1462 64:a7:69:53:0b:15 159851253 [*1*, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]]* 601* 2102 2 1462 64:a7:69:53:0b:15 159851253 [*1*, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]]* 602* 6424 2 1462 64:a7:69:53:0b:15 159851253 [*1*, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]]* 603* 513 2 1462 64:a7:69:53:0b:15 159899519 [0, 0, 0, 0, 0] 0 52.0 [] 604 24569 2 1462 64:a7:69:53:0b:15 159903297 [0, 0, 0, 0, 0] 0 52.0 [] 605 28311 2 1462 *64:a7:69:53:0b:15 159937368 [0, 0, 0, 0, 1] 0 52.0 [] 606 38496 2 1462* *64:a7:69:53:0b:15 159937368 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 607 3645 2 1462* 64:a7:69:53:0b:15 159937368 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 608 393 2 1462 64:a7:69:53:0b:15 159937368 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 609 809 2 1462 64:a7:69:53:0b:15 159954617 [0, 0, 0, 0, 0] 0 65.0 [] 610 51231 2 1462 *64:a7:69:53:0b:15 159956668 [0, 0, 0, 0, 1] 0 52.0 [] 611 55867 2 1462* *64:a7:69:53:0b:15 159956668 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 612 4501 2 1462* 64:a7:69:53:0b:15 159956668 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 613 79511 2 1462 64:a7:69:53:0b:15 159956668 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 614 6829 2 1462 64:a7:69:53:0b:15 159956668 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 615 830 2 1462 64:a7:69:53:0b:15 159956668 [1, 0, 0, 0, 1] 14 52.0 [[52.0, 4], [39.0, 5], [52.0, 5]] 616 1129 2 1462 This suggests that first frame (594,606,611) are valid and aggregated but the following frames are invalid. I will be very grateful to you if you can please tell how to interpret the frames which have the *aggregation *bit(last flag in tx_status flags ) *set* but *do not carry status info *and are delivered to monitor interface. The* sequence number *are* increasing* and never repeating ever again in trace If I remove these frames, it looks like there was no aggregation (I guess aggr of 1462 bytes ?), but why are those seq. numbers never re-used ? - Abhinav ___ ath9k-devel mailing list ath9k-devel@lists.ath9k.org https://lists.ath9k.org/mailman/listinfo/ath9k-devel
Re: [ath9k-devel] Questions in 802.11 n timestamps traces for aggregated frames ?
On 2013-03-13 12:48 AM, abhinav narain wrote: Hi Felix, Adrian, /* This packet was aggregated but doesn't carry status info */ if ((info-flags IEEE80211_TX_CTL_AMPDU) !(info-flags IEEE80211_TX_STAT_AMPDU)) return; Any packet with IEEE80211_TX_CTL_AMPDU and not IEEE80211_TX_STAT_AMPDU does not carry valid rate/retry information. Did you filter your debug stuff accordingly? I thought* rate_control_tx_status()[hook to minstrel_rate_ht]* in *ieee80211_tx_status()* will return back to driver (using the condition you told above) without reaching the radiotap code, but I don' t know why it does not. I continued and printed the flag, and have found the following : I added another flag(first entry) in radiotap header to indicate 1 if IEEE80211_TX_CTL_AMPDU is set and IEEE80211_TX_STAT_AMPDU is not (the above condition) eg. [1,0,0,0,1] first is above cond, last flag is for aggregation. The driver code indicates, this will be true if the frame passes through ath_tx_rc_status() [which is only twice]. I had noticed that if I do this, I won't probably get the timestamp of rest of the frames given back to mac80211. [...] This suggests that first frame (594,606,611) are valid and aggregated but the following frames are invalid. I will be very grateful to you if you can please tell how to interpret the frames which have the *aggregation *bit(last flag in tx_status flags ) *set* but *do not carry status info *and are delivered to monitor interface. The*sequence number *are* increasing* and never repeating ever again in trace If I remove these frames, it looks like there was no aggregation (I guess aggr of 1462 bytes ?), but why are those seq. numbers never re-used ? Your traces don't give you the full picture, because looking at this as a per-frame thing is wrong. If both IEEE80211_TX_CTL_AMPDU and IEEE80211_TX_STAT_AMPDU are set, it means that the contained rate retry information and time stamp refer to the whole A-MPDU, not just that one frame. Information about software retry of frames is not passed on to mac80211 at the moment. If you want accurate timestamp, rate and retry information, do it per A-MPDU, not per MPDU. - Felix ___ ath9k-devel mailing list ath9k-devel@lists.ath9k.org https://lists.ath9k.org/mailman/listinfo/ath9k-devel
Re: [ath9k-devel] Questions in 802.11 n timestamps traces for aggregated frames ?
You need to buffer those frames in your trace right up until you get the final frame bit which tells you that hey, this has some status associated with it. It's arguable whether that's the correct thing to do or not. The better (but slower) way would be to stamp each sub-frame with the completion status information of the status descriptor, including the rate control information. Adrian ___ ath9k-devel mailing list ath9k-devel@lists.ath9k.org https://lists.ath9k.org/mailman/listinfo/ath9k-devel
Re: [ath9k-devel] Questions in 802.11 n timestamps traces for aggregated frames ?
On Tue, Mar 12, 2013 at 7:59 PM, Adrian Chadd adr...@freebsd.org wrote: You need to buffer those frames in your trace right up until you get the final frame bit which tells you that hey, this has some status associated with it. This all the driver+mac is giving me, I haven't discarded any information in the trace, actually. It's arguable whether that's the correct thing to do or not. Yes, I am not sure myself. The better (but slower) way would be to stamp each sub-frame with the completion status information of the status descriptor, including the rate control information. The trace exactly does that. As the subframes of the aggregate are unfolded in* ath_tx_complete_aggr() *in xmit.c, and given to *ath_tx_complete_buf() *, I timestamp them with the * ath_tx_status* descriptor which is present, and get the above trace; turns out the descriptor is invalid ? - Abhinav Adrian ___ ath9k-devel mailing list ath9k-devel@lists.ath9k.org https://lists.ath9k.org/mailman/listinfo/ath9k-devel