Today I finally had a chance to benchmark xmit_more enablement on the Archer C7 v2.
I enabled xmit_more support in the driver, and also changed the NAPI settings to fire every 32 packets instead of 64. (A prior run on the wndr with a NAPI setting of 64 was worse.)

More netperf-wrapper data than you can shake a stick at here:

http://snapon.lab.bufferbloat.net/~d/archer_xmit_more_tests.tgz

Previous test runs for the wndr3800 here:

http://snapon.lab.bufferbloat.net/~d/xmit_more_3800.tgz

A) On rrul it was a slight lose, in an odd way. With the default settings, I would see packets queue up in the cake3 qdisc and be smartly managed there. With the new settings I saw nothing queue up in the cake3 qdiscs, less throughput, and more latency.

http://snapon.lab.bufferbloat.net/~d/xmit_more_comparison_archer.png [1]

My guess is that this change to the overall flow balance inside the router pushed all drops to the rx ring.

B) But on the tcp_upload test xmit_more was quite obviously a lose:

http://snapon.lab.bufferbloat.net/~d/xmit_more_tcp_upload.svg

...

So in summary I think that xmit_more is of value when the cpu is fast enough to stuff in and get bits at line rate, but not of value on these weak embedded cpus, where the shorter the code path, the better off you are. In my next build I will try disabling it entirely. I am open to other suggestions.

C) But - per the graph above - hostapd running all the time, as it presently does for no sane reason I can discern, also costs 80 Mbit/s of throughput, and cake3 has quite a ways to go until it is as fast as fq_codel. (NOT planning on worrying about speeding it up for a while! Am happy to keep it targeted at being faster than HTB at non-line rates, make it more correct, etc, etc. Line rate is not the target... stay calm... no premature optimizations please...) cake3 looks quite good vs htb+fq_codel; there is some netperf-wrapper data in there on that, also.

D) It was comforting to see ipv6 forwarding rates actually slightly higher than natted IPv4 on these tests, with the unaligned access hacks turned off, and compiled for this specific processor.

[1] I also fiddled with reducing the tx ring to 8, with "interesting" results.

-- 
Dave Täht
Let's make wifi fast, less jittery and reliable again!

https://plus.google.com/u/0/107942175615993706558/posts/TVX3o84jjmb
