I’ve had good success adding GRO into the mix to reduce CPU usage on WiFi APs 
with complex forwarding rules. Recent Qualcomm chipsets and their proprietary 
drivers support parts of GRO and TSO in firmware, which helps. The kernel’s 
software GSO function doesn’t work well for this purpose, because it allocates 
new skbs for the headers of every packet it segments. If it instead reused the 
existing header space in the skbs, even software GSO could probably be quite 
fast.
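
To make the allocation point above concrete, here is a toy userspace model in 
Python. It is not kernel code and says nothing about how the kernel's GSO path 
is actually written; `Buf`, `HEADER_LEN`, `make_header` and both segmenting 
functions are hypothetical names invented for this sketch. It only contrasts 
building each segment in a freshly allocated buffer with rewriting the header 
bytes that each buffer already carries.

```python
from dataclasses import dataclass

HEADER_LEN = 8  # hypothetical fixed header size for this toy model


@dataclass
class Buf:
    """Toy stand-in for an skb: HEADER_LEN bytes of header room, then payload."""
    data: bytearray


def make_header(seq: int, length: int) -> bytes:
    """Hypothetical header: 4-byte sequence number + 4-byte payload length."""
    return seq.to_bytes(4, "big") + length.to_bytes(4, "big")


def segment_with_alloc(bufs, first_seq):
    """Build every segment in a freshly allocated buffer (header + copied payload)."""
    out, allocs, seq = [], 0, first_seq
    for b in bufs:
        payload = b.data[HEADER_LEN:]
        fresh = bytearray(HEADER_LEN + len(payload))  # one new buffer per segment
        allocs += 1
        fresh[:HEADER_LEN] = make_header(seq, len(payload))
        fresh[HEADER_LEN:] = payload
        out.append(fresh)
        seq += len(payload)
    return out, allocs


def segment_in_place(bufs, first_seq):
    """Rewrite the header room already present at the front of each buffer."""
    seq = first_seq
    for b in bufs:
        n = len(b.data) - HEADER_LEN
        b.data[:HEADER_LEN] = make_header(seq, n)  # no new buffer, no payload copy
        seq += n
    return [b.data for b in bufs], 0


def demo_bufs():
    """Three buffers with empty header room and small payloads."""
    return [Buf(bytearray(HEADER_LEN) + bytearray(p))
            for p in (b"aaaa", b"bbbbbb", b"cc")]
```

Both functions produce the same bytes on the wire; the in-place variant simply 
skips the per-segment allocation and payload copy, which is the saving being 
described.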

I’ve yet to explore GRO by aggregate, but it feels like a natural fit: wireless 
packets arrive in aggregates of up to a few hundred frames, and those could be 
GROed together into one.
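
As a rough sketch of what coalescing per aggregate could look like, the toy 
Python below merges the frames of one received aggregate into larger packets. 
It is a userspace illustration under stated assumptions, not a driver or 
kernel implementation: `Frame`, `Super` and `gro_aggregate` are hypothetical 
names, frames are merged only when they belong to the same flow and their 
sequence numbers are contiguous, and everything is flushed when the aggregate 
ends.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    flow: tuple     # hypothetical flow key, e.g. (src, dst, sport, dport)
    seq: int        # byte sequence number of the first payload byte
    payload: bytes


@dataclass
class Super:
    """A coalesced packet built from one or more in-order frames of one flow."""
    flow: tuple
    seq: int
    payload: bytearray
    frames: int = 1

    @property
    def next_seq(self) -> int:
        return self.seq + len(self.payload)


def gro_aggregate(aggregate):
    """Coalesce the frames of a single aggregate; nothing is held across calls."""
    open_by_flow = {}   # flow key -> Super that can still be extended
    out = []
    for f in aggregate:
        cur = open_by_flow.get(f.flow)
        if cur is not None and cur.next_seq == f.seq:
            # In-order continuation of the same flow: append to the open packet.
            cur.payload += f.payload
            cur.frames += 1
        else:
            # New flow, or a gap in sequence: start a new coalesced packet.
            cur = Super(f.flow, f.seq, bytearray(f.payload))
            open_by_flow[f.flow] = cur
            out.append(cur)
    return out
```

Because the whole aggregate is available at once, there is no need for a 
timer to decide when to flush: the end of the aggregate is the flush point.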

Simon


> On Aug 9, 2020, at 2:35 PM, Jonathan Morton <chromati...@gmail.com> wrote:
> 
>> Are the risks and tradeoffs well enough understood (and visible enough 
>> for troubleshooting) to recommend broader deployment?
>> 
>> I recently gave openwrt a try on some hardware that I ultimately 
>> concluded was insufficient for the job.  Fairly soon after changing out 
>> my access point, I started getting complaints of Wi-Fi dropping in my 
>> household, especially when someone was trying to videoconference.  I 
>> discovered that my AP was spontaneously rebooting, and the box was 
>> getting hot.
> 
> Most CPE devices these days rely on hardware accelerated packet forwarding to 
> achieve their published specs.  That's all about taking packets in one side 
> and pushing them out the other as quickly as possible, with only minimal 
> support from the CPU (likely, new connections get a NAT/firewall lookup, 
> that's all).  It has the advantages of speed and power efficiency, but 
> unfortunately it is also incompatible with our debloating efforts.  So 
> debloated CPE will tend to run hotter and with lower peak throughput, which 
> may be noticeable to cable and fibre users; VDSL (FTTC) users might have 
> service of 80Mbps or less where this effect is less likely to matter.
> 
> It sounds like that AP had a very marginal thermal design which caused the 
> hardware to overheat as soon as the CPU was under significant load, which it 
> can easily be when a shaper and AQM are running on it at high throughput.  
> The cure is to use better designed hardware, though you could also 
> contemplate breaking the case open to cure the thermal problem directly.  
> There are some known reliable models which could be collected into a list.  
> As a rule of thumb, the ones based on ARM cores are likely to be designed 
> with CPU performance more in mind than those with MIPS.
> 
> Cake has some features which can be used to support explicit classification 
> and (de)prioritisation of traffic via firewall marking rules, either by 
> rewriting the Diffserv field or by associating metadata with packets within 
> the network stack (fwmark).  This can be very useful for pushing Bittorrent 
> or WinUpdate swarm traffic out of the way.  But for most situations, the 
> default flow-isolating behaviour already works pretty well, especially for 
> ensuring that one computer's network load has only a bounded effect on any 
> other.  We can discuss that in more detail if that would be helpful.
> _______________________________________________
> Bloat mailing list
> Bloat@lists.bufferbloat.net
> https://lists.bufferbloat.net/listinfo/bloat
