On Sun, Sep 2, 2012 at 11:08 AM, Dave Taht <[email protected]> wrote:
In reviewing this mail I realized I used three different names for
tcp_limit_output_bytes, corrected below...

> Codel will push stuff down to, but not below, 5ms of latency (or
> target). In fq_codel you will typically end up with 1 packet outstanding in
> each active queue under heavy load. At 10Mbit it's pretty easy to
> have it strain mightily and fail to get to 5ms, particularly on torrent-like
> workloads.
>
> The "right" amount of host latency to aim for is ... 0, or as close to it as
> you can get. Fiddling with codel target and interval on the host to
> get less host latency is well and good, but you can't get to 0 that way...
>
> The best queue on a host is no extra queue.
>
> I spent some time evaluating Linux fq_codel vs the ns2 nfq_codel version I
> just got working. In 150 bidirectional competing streams, at 100Mbit,
> it retained about 30% fewer packets in queue (110 vs 140). Next up
> on my list is longer RTTs and wifi, but all else was pretty equivalent.
>
> The effects of fiddling with /proc/sys/net/ipv4/tcp_limit_output_bytes
> were even more remarkable. At 6000, I would get down to
> a nice steady 71-81 packets in queue on that 150 stream workload.
>
> So, I started thinking through and playing with how TSQ works:
>
> At one hop 100Mbit, with a BQL of 3000 and a tcp_limit_output_bytes of 6000,
> all offloads off, nfq_codel on both ends, I get single stream throughput
> of 92.85Mbit. Backlog in qdisc is 0.
>
> 2 netperf streams, bidirectional: 91.47 each, darn close to theoretical, less
> than one packet in the backlog.
>
> 4 streams backlogs a little over 3. (and sums to 91.94 in each direction)
>
> 8, backlog of 8. (optimal throughput)
>
> Repeating the 8 stream test with tcp_limit_output_bytes of 1500, I get
> packets outstanding of around 3, and optimal throughput. (1 stream test:
> 42Mbit throughput (obviously starved), 150 streams: 82...)
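
For a sense of what "theoretical" means in the throughput numbers above, here
is a rough sketch of the best-case TCP goodput on Ethernet. It assumes a
1500-byte MTU, IPv4 with no options, and TCP timestamps enabled; none of those
settings are stated in the mail, and max_goodput_mbit is a name made up for
this illustration.

```python
# Sketch only: best-case TCP goodput over Ethernet.
# Assumptions (not from the mail): 1500-byte MTU, IPv4 without options,
# TCP timestamps on (12 bytes of TCP options per segment).
ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, MAC header, FCS, inter-frame gap
IP_HDR = 20
TCP_HDR = 20
TCP_TIMESTAMP_OPT = 12


def max_goodput_mbit(link_mbit, mtu=1500):
    """Return the payload rate in Mbit/s when the link is kept full of
    MTU-sized TCP segments."""
    payload = mtu - IP_HDR - TCP_HDR - TCP_TIMESTAMP_OPT  # 1448 bytes at MTU 1500
    wire_bytes = mtu + ETH_OVERHEAD                       # 1538 bytes on the wire
    return link_mbit * payload / wire_bytes


if __name__ == "__main__":
    print(round(max_goodput_mbit(100), 2))  # prints 94.15
```

Under those assumptions a 100Mbit link tops out near 94.15Mbit of payload, so
the 92.85Mbit single stream result is within about 1.4% of the ceiling.
Bidirectional runs also carry ACKs for the opposing stream, so they would be
expected to land a little lower.
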
>
> 8 streams, limit set to 127k, I get 50 packets outstanding in the queue,
> and the same throughput. (150 streams, ~100)
>
> So I might argue that a more "right" number for tcp_limit_output_bytes is
> not 128k per TCP socket, but (BQL_limit*2/active_sockets), in conjunction
> with fq_codel. I realize that that raises interesting questions as to when
> to use TSO/GSO, and how to schedule tcp packet releases, and pushes
> the window reduction issue all the way up into the tcp stack rather
> than responding to indications from the qdisc... but it does
> get you closer to a 0 backlog in qdisc.
>
> And *usually* the bottleneck link is not on the host but on something
> in between, and that's where your signalling comes from, anyway.

-- 
Dave Täht
http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
with fq_codel!"
_______________________________________________
Codel mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/codel
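
The BQL_limit*2/active_sockets suggestion can be sketched as a small
calculation. This is only an illustration: suggested_limit and its one-packet
floor of 1500 bytes are assumptions added here, not something the mail or the
kernel defines, and how to count "active" sockets is left open.

```python
# Sketch only: the per-socket limit heuristic from the mail,
#   tcp_limit_output_bytes ~= BQL_limit * 2 / active_sockets
# The floor is an added assumption so the result never drops below
# roughly one full-sized packet.
def suggested_limit(bql_limit, active_sockets, floor=1500):
    """Return a candidate tcp_limit_output_bytes value in bytes."""
    if active_sockets < 1:
        raise ValueError("active_sockets must be at least 1")
    raw = (bql_limit * 2) // active_sockets
    return max(floor, raw)


if __name__ == "__main__":
    for sockets in (1, 2, 8, 150):
        print(sockets, suggested_limit(3000, sockets))
```

With a BQL of 3000, one socket gives 6000, which matches the one hop setup
described in the mail. At 8 sockets the raw formula gives 750 bytes, less than
one packet, which is why the sketch clamps to the assumed floor.
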
