On 01/08/2011 04:20 PM, Felix Fietkau wrote:
> On 2011-01-08 8:33 AM, gree...@candelatech.com wrote:
>> From: Ben Greear<gree...@candelatech.com>
>>
>> This saves us constantly allocating large, multi-page
>> skbs. It should fix the order-1 allocation errors reported,
>> and in a 60-vif scenario, this significantly decreases CPU
>> utilization, and latency, and increases bandwidth.
>>
>> Signed-off-by: Ben Greear<gree...@candelatech.com>
>> ---
>> :100644 100644 b2497b8... ea2f67c... M drivers/net/wireless/ath/ath9k/recv.c
>>  drivers/net/wireless/ath/ath9k/recv.c | 92 ++++++++++++++++++++++-----------
>>  1 files changed, 61 insertions(+), 31 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c
>> b/drivers/net/wireless/ath/ath9k/recv.c
>> index b2497b8..ea2f67c 100644
>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>> @@ -1702,42 +1704,70 @@ int ath_rx_tasklet(struct ath_softc *sc, int flush,
>> bool hp)
>>  		    unlikely(tsf_lower - rs.rs_tstamp> 0x10000000))
>>  			rxs->mactime += 0x100000000ULL;
>>
>> -		/* Ensure we always have an skb to requeue once we are done
>> -		 * processing the current buffer's skb */
>> -		requeue_skb = ath_rxbuf_alloc(common, common->rx_bufsize, GFP_ATOMIC);
>> -
>> -		/* If there is no memory we ignore the current RX'd frame,
>> -		 * tell hardware it can give us a new frame using the old
>> -		 * skb and put it at the tail of the sc->rx.rxbuf list for
>> -		 * processing. */
>> -		if (!requeue_skb)
>> -			goto requeue;
>> -
>> -		/* Unmap the frame */
>> -		dma_unmap_single(sc->dev, bf->bf_buf_addr,
>> -				 common->rx_bufsize,
>> -				 dma_type);
>> +		len = rs.rs_datalen + ah->caps.rx_status_len;
>> +		if (use_copybreak) {
>> +			skb = netdev_alloc_skb(NULL, len);
>> +			if (!skb) {
>> +				skb = bf->bf_mpdu;
>> +				use_copybreak = false;
>> +				goto non_copybreak;
>> +			}
>> +		} else {
> I think this should be dependent on packet size, maybe even based on the
> architecture. Especially on embedded hardware, copying large frames is
> probably quite a bit more expensive than allocating large buffers. Cache
> sizes are small, memory access takes several cycles, especially during
> concurrent DMA.
> Once I'm back home, I could try a few packet size thresholds to find a sweet
> spot for the typical MIPS hardware that I'm playing with. I expect a visible
> performance regression from this patch when applied as-is.

I see a serious performance improvement with this patch. My current test
sends 1024-byte UDP payloads to/from each of 60 stations at 128kbps.
Please do try it out on your system and see how it performs there.

I'm guessing that any time you have more than 1 VIF this will be a good
improvement, since mac80211 does skb_copy (and you would typically be
copying a much smaller packet with this patch).

If we do see performance differences on different platforms, this could
perhaps be something we could tune at run-time.

Thanks,
Ben

>
> - Felix

-- 
Ben Greear <gree...@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
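
For reference, the size-dependent behaviour discussed in this thread could
look roughly like the sketch below. It is a userspace illustration only, not
code from the patch: malloc() and memcpy() stand in for the kernel's skb
allocation and copy, and the names rx_copybreak, struct rx_buf, should_copy()
and rx_deliver() are all hypothetical. The default of 256 bytes is an
arbitrary placeholder, not a measured sweet spot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical run-time knob (not in the patch): frames no longer than
 * this are copied into a right-sized buffer; 0 disables copying. */
static size_t rx_copybreak = 256;

/* Stand-in for one RX ring slot holding a full-size receive buffer. */
struct rx_buf {
	unsigned char *data;
	size_t size;
};

static bool should_copy(size_t frame_len)
{
	return rx_copybreak != 0 && frame_len <= rx_copybreak;
}

/*
 * Returns the buffer to pass up the stack (caller frees it), or NULL if
 * the frame must be dropped. *copied reports which path was taken.
 */
static unsigned char *rx_deliver(struct rx_buf *bf, size_t frame_len,
				 bool *copied)
{
	unsigned char *out, *refill;

	assert(frame_len <= bf->size);
	*copied = false;

	if (should_copy(frame_len)) {
		out = malloc(frame_len);
		if (out) {
			/* Small frame: copy it, keep the big buffer in the ring. */
			memcpy(out, bf->data, frame_len);
			*copied = true;
			return out;
		}
		/* Small allocation failed: fall back to the swap path. */
	}

	/* Large frame: hand the big buffer up and refill the slot. */
	refill = malloc(bf->size);
	if (!refill)
		return NULL;	/* drop the frame, reuse the old buffer */
	out = bf->data;
	bf->data = refill;
	return out;
}
```

With a knob like this, a platform where copying is expensive could lower the
threshold or set it to 0, while a many-VIF setup could raise it, without
rebuilding the driver.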