On Mon, Jan 9, 2023 at 12:46 PM rjmcmahon <rjmcma...@rjmcmahon.com> wrote:
>
> The write to read latencies (OWD) are on the server side in CLT form.
> Use --histograms on the server side to enable them.

Thx. It is far more difficult to instrument things on the server side
of the testbed but we will tackle it.

> Your client side sampled TCP RTT is 6ms with less than 1 ms of
> variance (or sqrt of variance as variance is typically squared). No
> retries suggest the network isn't dropping packets.

Thank you for analyzing that result. The cake AQM, set for a 5ms
target, with RFC3168-style ECN, is enabled on this path, on this
setup, at the moment. So the result is correct.

A second test with ECN off showed the expected retries.

I have emulations also of fifos, pie, fq-pie, fq-codel, red, blue,
sfq, with various realworld delays, and so on... but this is a bit
distracting at the moment from our focus, which was optimizing the
XDP + ebpf based bridge and epping based sampling tools to crack
25Gbit.

I think iperf2 will be great for us after that settles down.

> All the newer bounceback code is only in master and requires a compile
> from source. It will be released in 2.1.9 after testing cycles.
> Hopefully, in early March 2023

I would like to somehow parse and present those histograms.

>
> Bob
>
> https://sourceforge.net/projects/iperf2/
>
> > The DC that so graciously loaned us 3 machines for the testbed (thx
> > equinix!), does support ptp, but we have not configured it yet. In ntp
> > tests between these hosts we seem to be within 500us, and certainly
> > 50us would be great, in the future.
> >
> > I note that in all my kvetching about the new tests' needing
> > validation today... I kind of elided that I'm pretty happy with
> > iperf2's new tests that landed last august, and are now appearing in
> > linux package managers around the world. I hope more folk use them.
> > (sorry robert, it's been a long time since last august!)
> >
> > Our new testbed has multiple setups. In one setup - basically the
> > machine name is equal to a given ISP plan, and a key testing point is
> > looking at the differences between the FCC 25-3 and 100/20 plans in
> > the real world.
> > However at our scale (25gbit) it turned out that
> > emulating the delay realistically was problematic.
> >
> > Anyway, here's a 25/3 result for iperf (other results and iperf test
> > type requests gladly accepted)
> >
> > root@lqos:~# iperf -6 --trip-times -c c25-3 -e -i 1
> > ------------------------------------------------------------
> > Client connecting to c25-3, TCP port 5001 with pid 2146556 (1 flows)
> > Write buffer size: 131072 Byte
> > TOS set to 0x0 (Nagle on)
> > TCP window size: 85.3 KByte (default)
> > ------------------------------------------------------------
> > [  1] local fd77::3%bond0.4 port 59396 connected with fd77::1:2 port 5001 (trip-times) (sock=3) (icwnd/mss/irtt=13/1428/948) (ct=1.10 ms) on 2023-01-09 20:13:37 (UTC)
> > [ ID] Interval             Transfer     Bandwidth       Write/Err  Rtry  Cwnd/RTT(var)       NetPwr
> > [  1] 0.0000-1.0000 sec    3.25 MBytes  27.3 Mbits/sec  26/0       0     19K/6066(262) us    562
> > [  1] 1.0000-2.0000 sec    3.00 MBytes  25.2 Mbits/sec  24/0       0     15K/4671(207) us    673
> > [  1] 2.0000-3.0000 sec    3.00 MBytes  25.2 Mbits/sec  24/0       0     13K/5538(280) us    568
> > [  1] 3.0000-4.0000 sec    3.12 MBytes  26.2 Mbits/sec  25/0       0     16K/6244(355) us    525
> > [  1] 4.0000-5.0000 sec    3.00 MBytes  25.2 Mbits/sec  24/0       0     19K/6152(216) us    511
> > [  1] 5.0000-6.0000 sec    3.00 MBytes  25.2 Mbits/sec  24/0       0     22K/6764(529) us    465
> > [  1] 6.0000-7.0000 sec    3.12 MBytes  26.2 Mbits/sec  25/0       0     15K/5918(605) us    554
> > [  1] 7.0000-8.0000 sec    3.00 MBytes  25.2 Mbits/sec  24/0       0     18K/5178(327) us    608
> > [  1] 8.0000-9.0000 sec    3.00 MBytes  25.2 Mbits/sec  24/0       0     19K/5758(473) us    546
> > [  1] 9.0000-10.0000 sec   3.00 MBytes  25.2 Mbits/sec  24/0       0     16K/6141(280) us    512
> > [  1] 0.0000-10.0952 sec   30.6 MBytes  25.4 Mbits/sec  245/0      0     19K/5924(491) us    537
> >
> >
> > On Mon, Jan 9, 2023 at 11:13 AM rjmcmahon <rjmcma...@rjmcmahon.com>
> > wrote:
> >>
> >> My biggest barrier is the lack of clock sync by the devices, i.e.
> >> very limited support for PTP in data centers and in end devices. This
> >> limits the ability to measure one way delays (OWD) and most assume
> >> that OWD is 1/2 the RTT, which typically is a mistake. We know this
> >> intuitively with airplane flight times or even car commute times where
> >> the one way time is not 1/2 a round trip time. Google maps & directions
> >> provide a time estimate for the one way link. It doesn't compute a
> >> round trip and divide by two.
> >>
> >> For those that can get clock sync working, the iperf 2 --trip-times
> >> option is useful.
> >>
> >> --trip-times
> >>   enable the measurement of end to end write to read latencies
> >>   (client and server clocks must be synchronized)
> >>
> >> Bob
> >> > I have many kvetches about the new latency under load tests being
> >> > designed and distributed over the past year. I am delighted! that they
> >> > are happening, but most really need third party evaluation, and
> >> > calibration, and a solid explanation of what network pathologies they
> >> > do and don't cover. Also a RED team attitude towards them, as well as
> >> > thinking hard about what you are not measuring (operations research).
> >> >
> >> > I actually rather love the new cloudflare speedtest, because it tests
> >> > a single TCP connection, rather than dozens, and at the same time folk
> >> > are complaining that it doesn't find the actual "speed!". yet... the
> >> > test itself more closely emulates a user experience than speedtest.net
> >> > does. I am personally pretty convinced that the fewer numbers of flows
> >> > that a web page opens improves the likelihood of a good user
> >> > experience, but lack data on it.
> >> >
> >> > To try to tackle the evaluation and calibration part, I've reached out
> >> > to all the new test designers in the hope that we could get together
> >> > and produce a report of what each new test is actually doing.
> >> > I've tweeted, linked in, emailed, and spammed every measurement list
> >> > I know of, with only some response. Please reach out to other test
> >> > designer folks and have them join the rpm email list.
> >> >
> >> > My principal kvetches in the new tests so far are:
> >> >
> >> > 0) None of the tests last long enough.
> >> >
> >> > Ideally there should be a mode where they at least run to "time of
> >> > first loss", or periodically, just run longer than the
> >> > industry-stupid^H^H^H^H^H^Hstandard 20 seconds. There be dragons
> >> > there! It's really bad science to optimize the internet for 20
> >> > seconds. It's like optimizing a car, to handle well, for just 20
> >> > seconds.
> >> >
> >> > 1) Not testing up + down + ping at the same time
> >> >
> >> > None of the new tests actually test the same thing that the infamous
> >> > rrul test does - all the others still test up, then down, and ping. It
> >> > was/remains my hope that the simpler parts of the flent test suite -
> >> > such as the tcp_up_squarewave tests, the rrul test, and the rtt_fair
> >> > tests would provide calibration to the test designers.
> >> >
> >> > we've got zillions of flent results in the archive published here:
> >> > https://blog.cerowrt.org/post/found_in_flent/
> >> > ps. Misinformation about iperf 2 impacts my ability to do this.
> >>
> >> > The new tests have all added up + ping and down + ping, but not up +
> >> > down + ping. Why??
> >> >
> >> > The behaviors of what happens in that case are really non-intuitive, I
> >> > know, but... it's just one more phase to add to any one of those new
> >> > tests. I'd be deliriously happy if someone(s) new to the field
> >> > started doing that, even optionally, and boggled at how it defeated
> >> > their assumptions.
> >> >
> >> > Among other things that would show...
> >> >
> >> > It's the home router industry's dirty secret that darn few "gigabit"
> >> > home routers can actually forward in both directions at a gigabit.
> >> > I'd like to smash that perception thoroughly, but given that our
> >> > starting point is that a gigabit router was a "gigabit switch" - and
> >> > historically has been something that couldn't even forward at
> >> > 200Mbit - we have a long way to go there.
> >> >
> >> > Only in the past year have non-x86 home routers appeared that could
> >> > actually do a gbit in both directions.
> >> >
> >> > 2) Few are actually testing within-stream latency
> >> >
> >> > Apple's rpm project is making a stab in that direction. It looks
> >> > highly likely that, with a little more work, crusader and
> >> > go-responsiveness can finally start sampling the tcp RTT, loss and
> >> > markings, more directly. As for the rest... sampling TCP_INFO on
> >> > windows, and Linux, at least, always appeared simple to me, but I'm
> >> > discovering how hard it is by delving deep into the rust behind
> >> > crusader.
> >> >
> >> > the goresponsiveness thing is also IMHO running WAY too many streams
> >> > at the same time, I guess motivated by an attempt to have the test
> >> > complete quickly?
> >> >
> >> > B) To try and tackle the validation problem:
> >> > ps. Misinformation about iperf 2 impacts my ability to do this.
> >>
> >> >
> >> > In the libreqos.io project we've established a testbed where tests can
> >> > be plunked through various ISP plan network emulations. It's here:
> >> > https://payne.taht.net (run bandwidth test for what's currently hooked
> >> > up)
> >> >
> >> > We could rather use an AS number and at least an ipv4/24 and ipv6/48 to
> >> > leverage with that, so I don't have to nat the various emulations.
> >> > (and funding, anyone got funding?) Or, as the code is GPLv2 licensed,
> >> > to see more test designers set up a testbed like this to calibrate
> >> > their own stuff.
> >> >
> >> > Presently we're able to test:
> >> > flent
> >> > netperf
> >> > iperf2
> >> > iperf3
> >> > speedtest-cli
> >> > crusader
> >> > the broadband forum udp based test:
> >> > https://github.com/BroadbandForum/obudpst
> >> > trexx
> >> >
> >> > There's also a virtual machine setup that we can remotely drive a web
> >> > browser from (but I didn't want to nat the results to the world) to
> >> > test other web services.
> >> > _______________________________________________
> >> > Rpm mailing list
> >> > r...@lists.bufferbloat.net
> >> > https://lists.bufferbloat.net/listinfo/rpm

--
This song goes out to all the folk that thought Stadia would work:
https://www.linkedin.com/posts/dtaht_the-mushroom-song-activity-6981366665607352320-FXtz
Dave Täht CEO, TekLibre, LLC
_______________________________________________
LibreQoS mailing list
LibreQoS@lists.bufferbloat.net
https://lists.bufferbloat.net/listinfo/libreqos
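
For anyone wanting to parse and sanity-check the enhanced (-e) output quoted earlier in the thread: the NetPwr column there looks consistent with throughput in bytes/sec divided by the sampled RTT in seconds, scaled by 1e-6. That scaling is inferred from the numbers in the 25/3 run, not taken from the iperf 2 source, so treat it as an assumption. A minimal Python sketch (the names net_power and WRITE_SIZE are made up for illustration):

```python
# Sketch: recompute the NetPwr column from the other fields of an
# iperf 2 "-e" interval line. The 1e-6 scaling is an assumption
# inferred from the quoted output, not from the iperf 2 source.

WRITE_SIZE = 131072  # "Write buffer size" reported in the client header


def net_power(bytes_sent: int, interval_s: float, rtt_us: float) -> float:
    """Throughput (bytes/s) divided by RTT (s), scaled by 1e-6."""
    throughput = bytes_sent / interval_s  # bytes per second
    rtt_s = rtt_us / 1e6                  # microseconds to seconds
    return 1e-6 * throughput / rtt_s


# (writes, RTT in microseconds) for the first three 1-second intervals
samples = [(26, 6066), (24, 4671), (24, 5538)]

for writes, rtt_us in samples:
    # bytes transferred = completed writes * write buffer size
    print(round(net_power(writes * WRITE_SIZE, 1.0, rtt_us)))
```

The three values printed line up with the 562, 673 and 568 that iperf reported for those intervals, which is why I think the formula is right. Note that it moves with the RTT even when the rate is pinned at the plan's 25Mbit.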