On 01/09/2011 10:13 AM, Jouni Malinen wrote:
> On Sat, Jan 08, 2011 at 04:36:23PM -0800, Ben Greear wrote:
>> On 01/08/2011 04:20 PM, Felix Fietkau wrote:
>>> On 2011-01-08 8:33 AM, gree...@candelatech.com wrote:
>>>> From: Ben Greear <gree...@candelatech.com>
>>>> This saves us constantly allocating large, multi-page
>>>> skbs.  It should fix the order-1 allocation errors reported,
>>>> and in a 60-VIF scenario it significantly decreases CPU
>>>> utilization and latency, and increases bandwidth.
>
> As far as CPU use is concerned, a 60-VIF scenario should not be the
> one to use for checking what is most efficient. This really needs to
> be tested with a single VIF on an embedded (low-power CPU) platform.
>
> For the order-1 allocation issues, it would be interesting to see if
> someone could take a look at using paged skbs or multiple RX descriptors
> with shorter skbs (and copying only for the case where a long frame is
> received so that only the A-MSDU RX case would suffer from extra
> copying).
>
>> I see a serious performance improvement with this patch.  My current
>> test is sending 1024 byte UDP payloads to/from each of 60 stations at
>> 128kbps.  Please do try it out on your system and see how it performs
>> there.  I'm guessing that any time you have more than 1 VIF this will
>> be a good improvement, since mac80211 does skb_copy (and you would
>> typically be copying a much smaller packet with this patch).
>
> How would this patch change the number of bytes copied by skb_copy?

It seems that if you allocate a 2-page skb, as upstream ath9k does, and
pass it up the stack, then whenever anything calls skb_copy() on it, the
copy is also allocated with the full 2-page linear buffer, even if the
actual data length is much smaller.
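
To illustrate (this is a paraphrase of the skb_copy() logic in
net/core/skbuff.c for the linear case, not the exact upstream code):
the copy is sized from the whole allocated buffer, not from skb->len:

#include <linux/skbuff.h>

/* Paraphrased sketch of skb_copy() (linear skbs only, not the exact
 * upstream code): the new skb is sized from the original's whole
 * linear buffer (head..end), not from skb->len. */
struct sk_buff *skb_copy_sketch(const struct sk_buff *skb, gfp_t gfp_mask)
{
	int headerlen = skb_headroom(skb);
	/* Full allocated buffer: copying a 2-page RX skb forces another
	 * order-1 allocation here, even when the frame itself is short. */
	unsigned int size = skb_end_pointer(skb) - skb->head;
	struct sk_buff *n = alloc_skb(size, gfp_mask);

	if (!n)
		return NULL;
	skb_reserve(n, headerlen);
	skb_put(n, skb->len);
	if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
		BUG();
	return n;
}

So every copy of a 2-page RX skb is itself an order-1 allocation, no
matter how short the frame.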

This copy wouldn't be so bad for single-VIF scenarios (where there is
probably no copying at all), but you still end up exhausting the order-1
page pool with lots of big skbs floating around the system.  Note that
the original bug was not filed by me and happened on some embedded
device, though I also see memory exhaustion in my tests with the
upstream code.
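
For reference, the idea in the patch is roughly this (a minimal sketch
only; the helper name and the way the DMA buffer gets recycled are made
up here, not the actual patch code):

#include <linux/skbuff.h>

/* Minimal sketch: copy the received frame into an skb sized to the
 * frame, so the multi-page DMA buffer never leaves the driver and any
 * later skb_copy() in mac80211 only allocates a small buffer. */
static struct sk_buff *ath_rx_short_copy(struct sk_buff *dma_skb)
{
	struct sk_buff *nskb = dev_alloc_skb(dma_skb->len + NET_IP_ALIGN);

	if (!nskb)
		return NULL;
	skb_reserve(nskb, NET_IP_ALIGN);
	skb_copy_from_linear_data(dma_skb,
				  skb_put(nskb, dma_skb->len),
				  dma_skb->len);
	/* dma_skb stays with the driver and is re-queued on the RX ring. */
	return nskb;
}

That bounds the number of live multi-page buffers by the RX ring size,
instead of by however many skbs the stack happens to be holding.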

>
>> If we do see performance differences on different platforms, this
>> could perhaps be something we could tune at run-time.
>
> I guess that could be looked at, but as long as that is not the case,
> the test setup you used is not exactly the most common case for ath9k in
> the upstream kernel and should not be used to figure out default
> behavior.

True, but I also like the protection this should offer against the
stray DMA that this chipset/driver seems capable of.

I'm curious: does anyone have any ath9k performance stats at all?

Thanks,
Ben

-- 
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com