Re: [ntp:questions] Detecting bufferbloat via ntp?
dwmal...@maths.tcd.ie (David Malone) writes:

>> The end points need at least bandwidth*latency buffers simply to keep
>> the flow going, while routers in between should have very little buffer
>> space, simply because that will allow the end points to discover the
>> real channel capacity much sooner.
>
> For traditional TCP (single flow), you need bandwidth*latency as sockbuf
> at both ends, plus the same at the bottleneck router. Some of the new TCP
> congestion control systems can do with less and still fill the link, if
> they are the only flow.
>
>> You might claim that a little intermediate buffer space is a good thing,
>> in that it can allow a short-term burst of packets to get through
>> without having to discard other useful stuff, but only as long as most
>> links have spare capacity most of the time.
>
> There was some work a few years ago that suggested you need about
> bandwidth*latency/sqrt(n) buffering at a link with n bottlenecked TCP
> flows in order to make sure the flows can actually use the link. There
> was also a suggestion that you could get away with less, but that seemed
> to require a quite large n in practice.

Outer bound, according to Kleinrock. I think everybody is on the same page
here, but at the risk of repeating myself: a TCP sockbuf sized as per the
above lives in a different place than the software tx queues, DMA tx
queues, and device-specific tx queues. Receive buffers can be large. TX,
not so much.

>> The end points need at least bandwidth*latency buffers simply to keep
>> the flow going, while routers in between should have very little buffer
>> space, simply because that will allow the end points to discover the
>> real channel capacity much sooner.

Or before a bufferbloated cascade forces a TCP reset.

> David.

--
Dave Taht
http://nex-6.taht.net
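To put rough numbers on the per-flow sockbuf rule and the
bandwidth*latency/sqrt(n) bottleneck estimate above, here is a small
back-of-the-envelope Python sketch. The 1 Gbit/s link, 100 ms RTT, and flow
counts are made-up illustrative figures, not measurements from this thread.

# Back-of-the-envelope buffer sizing.  All figures are illustrative
# assumptions, not measurements.

def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: the sockbuf a single TCP flow needs
    at each end to keep the pipe full."""
    return bandwidth_bps * rtt_s / 8

def bottleneck_buffer_bytes(bandwidth_bps, rtt_s, n_flows):
    """The bandwidth*latency/sqrt(n) estimate for a bottleneck link
    carrying n desynchronised TCP flows."""
    return bdp_bytes(bandwidth_bps, rtt_s) / n_flows ** 0.5

if __name__ == "__main__":
    link = 1e9    # 1 Gbit/s bottleneck (assumed)
    rtt = 0.100   # 100 ms round trip (assumed)
    print("per-flow sockbuf: %.1f MB" % (bdp_bytes(link, rtt) / 1e6))
    for n in (1, 100, 10000):
        mb = bottleneck_buffer_bytes(link, rtt, n) / 1e6
        print("router buffer for %5d flows: %.3f MB" % (n, mb))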
Re: [ntp:questions] Detecting bufferbloat via ntp?
Terje Mathisen <terje.mathisen at tmsw.no> writes:

> Rick Jones wrote:
>> Kevin Oberman <ober...@es.net> wrote:
>>> No, you probably won't. Both theoretical and empirical information
>>> shows that overly large windows are not a good thing. This is the
>>> reason all modern network stacks have implemented dynamic window
>>> sizing. As far as I know, Linux, MacOS (I think), Windows, and BSD (at
>>> least FreeBSD) all do this, and do it better than it is possible to do
>>> manually. N.B. Windows XP probably does not qualify as modern.
>>
>> Sadly, I see Linux's dynamic window sizing take the window to 4MB when
>> 128KB would do. I'm not familiar with the behaviour of the other stacks.

I did a little testing with Rick a couple of days ago. It turned out his
problem was not in his end nodes; somewhere in the path between his two
sites there is something rather bloated. I suggested he try TCP Vegas or
Veno, as those attempt to deal with buffering in their own ways. Vegas is
actually sort of malfunctioning nowadays, in that it was designed to cope
with sane levels of buffering, not what we are seeing today.

Actually finding the most bloated device in the path is something of a
hard problem...

> There's a huge difference between the window sizes at the ends of a link
> and those employed at the various nodes in between: The end points need
> at least bandwidth*latency buffers simply to keep the flow going, while
> routers in between should have very little buffer space, simply because
> that will allow the end points to discover the real channel capacity
> much sooner.

Exactly. Yea! You get it.

> You might claim that a little intermediate buffer space is a good thing,
> in that it can allow a short-term burst of packets to get through
> without having to discard other useful stuff, but only as long as most
> links have spare capacity most of the time.

A *little* is just fine. Bloated buffers - containing hundreds, thousands,
tens of thousands of packets - which is what we are seeing today - is not.

> Terje

--
Dave Taht
http://nex-6.taht.net
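For anyone who wants to try the Vegas suggestion on Linux, a minimal sketch
of selecting a congestion control algorithm for a single connection follows.
It assumes a Linux host with the tcp_vegas module available and a Python
build that exposes socket.TCP_CONGESTION; neither is guaranteed, so treat
this as illustrative rather than a recipe.

# Minimal sketch: request TCP Vegas on one socket instead of changing the
# system-wide default.  Linux-only; assumes tcp_vegas is loaded and that
# this Python build exposes socket.TCP_CONGESTION.
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"vegas")
    print("this socket will use:",
          s.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16))
except (AttributeError, OSError):
    # Fall back to showing the system-wide default instead.
    with open("/proc/sys/net/ipv4/tcp_congestion_control") as f:
        print("vegas not selectable here; system default is", f.read().strip())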
Re: [ntp:questions] Detecting bufferbloat via ntp?
Rick Jones <rick.jon...@hp.com> writes:

> Dave Täht <d...@taht.net> wrote:
>> Terje Mathisen <terje.mathisen at tmsw.no> writes:
>>> Rick Jones wrote:
>>>> Kevin Oberman <ober...@es.net> wrote:
>>>>> No, you probably won't. Both theoretical and empirical information
>>>>> shows that overly large windows are not a good thing. This is the
>>>>> reason all modern network stacks have implemented dynamic window
>>>>> sizing. As far as I know, Linux, MacOS (I think), Windows, and BSD
>>>>> (at least FreeBSD) all do this, and do it better than it is possible
>>>>> to do manually. N.B. Windows XP probably does not qualify as modern.
>>>>
>>>> Sadly, I see Linux's dynamic window sizing take the window to 4MB
>>>> when 128KB would do. I'm not familiar with the behaviour of the other
>>>> stacks.
>>
>> I did a little testing with Rick a couple of days ago. It turned out
>> his problem was not in his end nodes; somewhere in the path between his
>> two sites there is something rather bloated.
>
> Or rather, that even after setting the tx queue lengths to 32 packets, a
> test between that system and one 7 ms away still resulted in 4MB socket
> buffers by the end of the test, i.e. confirming that the Linux
> autotuning code was still willing to grow the windows larger than
> necessary.

Better said. The evidence of bloat is not conclusive. Did you give Vegas a
shot?

>> A *little* is just fine. Bloated buffers - containing hundreds,
>> thousands, tens of thousands of packets - which is what we are seeing
>> today - is not.
>
> Well, the BDP of a 10GbE link might actually be measured in thousands of
> packets or more... if my systems 7 ms apart were joined by a 10 GbE
> link, that would be a bit more than 5800 1500-byte packets.
>
> I'm thinking that while we may have to configure queues in terms of
> number of packets, we shouldn't think of them that way, but as a length
> of time.

I agree; the dynamic range of today's devices presents a problem.

> rick jones

--
Dave Taht
http://nex-6.taht.net
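Rick's 5800-packet figure is easy to check, and it also illustrates his
"think in time, not packets" point. A quick Python sanity check, taking the
7 ms as the relevant delay (as the post does) and 1500-byte packets; the
1 Mbit/s comparison rate is an assumed example.

# Sanity check on the numbers above, plus the packets-versus-time view.
bandwidth = 10e9    # bits/s (10 GbE, from the post)
delay = 7e-3        # seconds (from the post)
mtu = 1500          # bytes per full-size packet

bdp = bandwidth * delay / 8                 # bytes in flight
packets = bdp / mtu
print("BDP: %.2f MB = about %d full-size packets" % (bdp / 1e6, packets))

# The same queue depth expressed as drain time at two very different rates:
for rate in (10e9, 1e6):
    drain = packets * mtu * 8 / rate
    print("%d packets is %.3f seconds of queue at %.0e bit/s"
          % (packets, drain, rate))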
Re: [ntp:questions] Detecting bufferbloat via ntp?
Rob <nom...@example.com> writes:

> Dave Täht <d...@taht.net> wrote:
>> Bufferbloat is different than the TCP receive or transmit window size.
>> Bufferbloat is unmanaged buffers in the software TXQUEUE, the hardware
>> TX ring, and the device itself. The TCP layer of buffering is not what
>> we are talking about with bufferbloat. TCP's window and amount of
>> buffering is highly tunable, and controllable.
>
> If users would set their TCP window to a reasonable value, the
> bufferbloat problem would not exist! When the TCP window is correct for
> the delay*bandwidth product of a TCP session, there are no packets
> piling up in buffers halfway, as there is a continuous stream of just
> enough data.

Without TCP supplying timely - unbuffered - feedback via ACK and ECN, data
will pile up in the unmanaged buffers, period. Unmanaged buffers have a
tendency to fool TCP into thinking that your actual distance from site A
to site B is several lunar distances.

> When everyone sets multi-meg windows and does not properly slow-start
> the connection, the bufferbloat problem kicks in.

There are many other potential causes, but yes, that's one.

This thread has gotten way off topic; if you'd like to discuss this sort
of stuff in more detail, join lists.bufferbloat.net.

--
Dave Taht
http://nex-6.taht.net
Re: [ntp:questions] Detecting bufferbloat via ntp?
Terje Mathisen <terje.mathisen at tmsw.no> writes:

> Rob wrote:
>> But of course this is completely unreasonable. It will bring you
>> nothing to use a larger window than the rate of your connection times
>> the maximum round-trip time you get for the sites you download from.
>> And it has
>
> Actually, I would suggest a small amount more, as a safety margin, but
> otherwise I agree.
>
>> disadvantages to set the window larger than that, which those websites
>> usually don't explain to you.
>
> My personal problem is that I am sitting at the end of a 30/30 Mbit/s
> fiber connection here in Oslo, Norway: this means that I can very easily
> run out of buffer space when talking to US servers: 200 ms ping times
> correspond to 6 Mbit of data, i.e. I need about 800 kB of network
> buffers just to keep the pipe going, more if I want to be able to
> recover from an occasional lost packet with a fast retransmit.

Bufferbloat is different than the TCP receive or transmit window size.
Bufferbloat is unmanaged buffers in the software TXQUEUE, the hardware TX
ring, and the device itself. The TCP layer of buffering is not what we are
talking about with bufferbloat. TCP's window and amount of buffering is
highly tunable, and controllable.

> In the real world I simply have to split up my SFTP/SCP/RSYNC transfers
> so that I can run N of them in parallel: this gets me N times the
> buffers supported (at both ends) by the underlying ssh protocol.
>
> What makes the situation complicated is that there could be more than
> one download going on at the same time.

Right. :-)

> Terje

--
Dave Taht
http://nex-6.taht.net
Re: [ntp:questions] Detecting bufferbloat via ntp?
Terje Mathisen <terje.mathisen at tmsw.no> writes:

> The crux of the matter is that close to nobody actually implements [ntp
> prioritization], even if all of the carriers involved happen to support
> it. Getting NTP into that realtime group across the entire Internet is a
> non-starter.

In the case of building a wide-scale bufferbloat detector/monitor, this
bug may be a feature.

I have seen nothing pass by in this discussion so far that suggests the
idea is unworkable, so I've started drafting code over on github to
explore it further.

--
Dave Taht
http://nex-6.taht.net
Re: [ntp:questions] Detecting bufferbloat via ntp?
John Hasler <jhas...@newsguy.com> writes:

> rick jones writes:
>> The term may be new, but the idea, in its abstract form, has been
>> around for years.
>
> Only recently, however, has memory become cheap enough for people to
> stuff megabytes of buffer into routers without considering the
> consequences.

Also, the dynamic range of internet connections goes from 64k to 10GigE
now, and the exploding growth of lossy wireless links has introduced new
problems not even theoretically accounted for.

> BTW, there are queuing algorithms that work well with large buffers.

Few of which appear to be in use.

--
Dave Taht
http://nex-6.taht.net
Re: [ntp:questions] Detecting bufferbloat via ntp?
"E-Mail Sent to this address will be added to the BlackLists"
<Null@BlackList.Anitech-Systems.invalid> writes:

> Dave Täht wrote:
>> 24 hours' worth of rawstats data (if that contains enough information;
>> judging from the discussion on the list, it does) from a given ntp pool
>> or pools would be revealing. Would there be anyone here interested in
>> collecting that data and poking into this issue further?
>
> Why not do that with your own NTP clients?

I plan on it.

https://github.com/dtaht/Cosmic-Background-Bufferbloat-Detector

However, whatever data set I can collect now would be dwarfed by one a
larger ntp pool could provide as a starting point, exposing problems with
the concept, showing differences between versions of ntp and hosts, etc.

--
Dave Taht
http://nex-6.taht.net
[ntp:questions] Detecting bufferbloat via ntp?
I've been racking my brain trying to come up with a good way of
semi-passively detecting bufferbloat at the data center. What would wild
swings in latency, on the order of seconds, from an ntp client register as
on an ntp server?

--
Dave Taht
http://nex-6.taht.net
Re: [ntp:questions] Detecting bufferbloat via ntp?
unruh <un...@wormhole.physics.ubc.ca> writes:

> On 2011-02-08, Rick Jones <rick.jon...@hp.com> wrote:
>> Dave Täht <d...@taht.net> wrote:
>>> I've been racking my brain trying to come up with a good way of
>>> semi-passively detecting bufferbloat at the data center. What would
>>> wild swings in latency, on the order of seconds, from an ntp client
>>> register as on an ntp server?
>>
>> Trying to avoid ICMP fast paths? Once everything is stable the polling
>> interval is going to get pretty large (1024 seconds) - watch long
>> enough and I suppose one will see buffer bloat in the stats, but it
>> might take quite a while to hit. You may need/want to look for it a bit
>> more actively.
>
> Not at all clear what buffer bloat is supposed to be. This seems to
> imply that it is buffer leakage, i.e. the buffer keeps growing because
> stuff is not properly removed from the buffer. The pointers on the web
> seem to imply that the program assigns buffers which are far too large
> and that therefore the buffers will need to get paged in, slowing
> everything down.

Gettys explains it (at about 750 milli-lampsons, in a presentation of
about 20 minutes) here:

http://mirrors.bufferbloat.net/Talks/BellLabs01192011/murray_hill01192011_Bufferbloat_Talk_Edited_For_brevity.mp3
+
http://mirrors.bufferbloat.net/Talks/BellLabs01192011/110126141155_BufferBloat11.ppt

Some of the periodic behavior he's seeing worries Van Jacobson. See slides
23-32.

The network buffers in ntp itself are tiny - the datagram is far less than
1K. It is being queued up behind potentially thousands of other FIFO
packets contained in dark, unmanaged buffers between the TCP stack and the
edge gateway.

> I have no idea what the OP is asking - is he afraid that the writers of
> ntpd were incompetent and wants to test this particular form of
> incompetence? Has he seen evidence that seems to imply that ntpd suffers
> from bufferbloat?

No. I am trying to come up with a reasonably passive way to detect
bufferbloat - which is currently only observable by doing
latency-under-load tests at the network edge - from the data center.

> If the latter, why not tell us the symptoms that make him suspect this?
> If the former, what makes him think that the programmers screwed up?

I think ntp is great. I also theorize that clocks are wandering more than
we might expect, due to bufferbloat.

>> rick jones keeps forgetting if any of the interface MIBs specify an
>> outbound queue length statistic...

--
Dave Taht
http://nex-6.taht.net
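To make the "queued behind thousands of FIFO packets" point concrete: a
tiny NTP datagram inherits the drain time of everything sitting ahead of it
in the queue. A small Python sketch follows; the uplink rates and queue
depths are assumed examples, not measurements from this thread.

# Queueing delay an NTP request sees behind a FIFO full of 1500-byte
# packets.  Rates and depths below are assumed examples only.
PACKET_BITS = 1500 * 8

for rate_bps, label in ((1e6, "1 Mbit/s uplink"),
                        (10e6, "10 Mbit/s uplink")):
    for depth in (128, 1000):
        wait = depth * PACKET_BITS / rate_bps   # seconds of extra delay
        print("%-16s %5d-packet queue ahead -> %6.2f s extra delay"
              % (label, depth, wait))

At 1 Mbit/s, a 1000-packet queue ahead of the NTP datagram is on the order
of twelve seconds, which is exactly the "wild swings in latency on the
order of seconds" the original question asks about.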
Re: [ntp:questions] Detecting bufferbloat via ntp?
unruh <un...@wormhole.physics.ubc.ca> writes:

> On 2011-02-08, Chuck Swiger <cswi...@mac.com> wrote:
>>>> On Feb 8, 2011, at 12:05 PM, Richard B. Gilbert wrote:
>>>>> What would wild swings in latency, on the order of seconds, from an
>>>>> ntp client register as on an ntp server?
>>>>
>>>> High latency (delay column in ntpq -p output), high jitter. Regards,
>>>
>>> Why would the server even notice what the client is doing? The server
>>> does not monitor clients; it simply responds with the correct time
>>> whenever it is asked for the correct time. Now ntpq -p on the *client*
>>> would show high latency, etc.
>>
>> I agree that ntpq -p on the client should show this; that was what I'd
>> meant to suggest. :-)
>>
>> This being said, an NTP client does indeed provide its time to the
>> server when making an NTP request, in a field called the Transmit
>> Timestamp. A typical NTP exchange looks like the following:
>
> Yes, but all the server does is copy that timestamp into the outgoing
> packet, put its receive timestamp into the appropriate box, and, when it
> is ready to send the packet back, put in its own transmit timestamp.

Presently! It sounds like a lightly modified ntp server - or set thereof -
could actually attempt to keep track of this information. Might even be
possible to do it in a pf or iptables rule.

>> # tcpdump -n -s 0 -v -v port ntp
>> [Elided]

Do ntp clients use an ephemeral udp src address for a query?

--
Dave Taht
http://nex-6.taht.net
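For reference, the per-exchange arithmetic such a lightly modified server
(or a log post-processor) would need is just the standard NTP
four-timestamp calculation. A minimal Python sketch follows; the function
and variable names are my own choosing, not anything taken from the ntpd
sources.

def ntp_exchange_stats(t1, t2, t3, t4):
    """Standard NTP four-timestamp arithmetic (all values in seconds).
    t1: client transmit (origin)   t2: server receive
    t3: server transmit            t4: client receive (destination)"""
    delay = (t4 - t1) - (t3 - t2)            # round-trip path delay
    offset = ((t2 - t1) + (t3 - t4)) / 2.0   # client clock offset vs. server
    # One-way figures are only meaningful if both clocks are already good:
    outbound = t2 - t1                       # client -> server
    inbound = t4 - t3                        # server -> client
    return delay, offset, outbound, inbound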
Re: [ntp:questions] Detecting bufferbloat via ntp?
hal-use...@ip-64-139-1-69.sjc.megapath.net (Hal Murray) writes:

> In article 87aai6d0t6@cruithne.co.teklibre.org, d...@taht.net
> (Dave Täht) writes:
>> I've been racking my brain trying to come up with a good way of
>> semi-passively detecting bufferbloat at the data center. What would
>> wild swings in latency, on the order of seconds, from an ntp client
>> register as on an ntp server?
>
> Are you trying to detect it in real time, or collect long term data?

A medium-to-long term study. There are studies going on via SamKnows,
Georgia Tech, and the FCC at present, but they are mostly using host-side
measurement and active measurement tools.

> Turn on rawstats. (info in monopt.html) That will log 4 time stamps for
> each ntp packet received. If the time on both systems is accurate, that
> will give you a pair of one-way times.

It would be good if someone (or someones) has actually been collecting
rawstats for a long period, to serve as a baseline. Bufferbloat is a
relatively new phenomenon.

> ntp tries to filter out the packets with long (queueing) delays, so you
> will see bad stuff in rawstats that you won't see in peerstats or
> ntpq -peers.

I was afraid of that. What qualifies as a long queueing delay?

--
Dave Taht
http://nex-6.taht.net
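A sketch of turning a rawstats file into the pair of one-way times Hal
describes; it assumes the classic rawstats layout documented in monopt.html
(MJD, seconds past midnight, source and destination addresses, then the
origin/receive/transmit/destination timestamps), which may vary between
ntpd versions, and the log path shown is only an assumed example.

# Pull one-way delay estimates out of an ntpd rawstats file.  Assumes the
# classic field order: MJD, seconds, source, dest, t1 (origin),
# t2 (receive), t3 (transmit), t4 (destination); extra trailing fields in
# newer ntpd versions are ignored.  Results are only as good as the clocks.
def one_way_delays(path):
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 8:
                continue
            t1, t2, t3, t4 = map(float, fields[4:8])
            yield t2 - t1, t4 - t3   # (outbound, inbound) in seconds

if __name__ == "__main__":
    # Path is an assumed example of a typical statsdir location.
    for out, back in one_way_delays("/var/log/ntpstats/rawstats"):
        print("out %9.3f s   back %9.3f s" % (out, back))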
Re: [ntp:questions] Detecting bufferbloat via ntp?
Rick Jones <rick.jon...@hp.com> writes:

> Dave Täht <d...@taht.net> wrote:
>> No, I am trying to come up with a reasonably passive way to detect
>> bufferbloat - which is currently only observable by doing
>> latency-under-load tests at the network edge - from the data center.
>
> So you are hoping that ntp's natural (albeit infrequent) latency
> measurements might expose it for you?

With a large enough dataset, the bufferbloat signature might stand out, as
it is often on the order of seconds according to active measurement tools
such as netanalyzer. Not only that, but the originating timestamp data
currently being discarded might point to effective queue lengths measured
in lunar distances. The noise that we've been filtering out may actually
be at a dull roar.

But I don't know. 24 hours' worth of rawstats data (if that contains
enough information; judging from the discussion on the list, it does) from
a given ntp pool or pools would be revealing.

Would there be anyone here interested in collecting that data and poking
into this issue further?

--
Dave Taht
http://nex-6.taht.net
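And a sketch of what "poking into" such a dataset might look like: compare
the tail of the one-way delay distribution against the uncongested
baseline. The structure and the half-second threshold are assumptions of
mine, chosen only because the bloat episodes under discussion are on the
order of seconds; this is not an established methodology.

# Summarise a day's worth of one-way delay samples (e.g. from the rawstats
# parser sketched earlier in the thread).
def bloat_summary(samples, threshold=0.5):
    samples = sorted(samples)
    if not samples:
        return None
    baseline = samples[0]                          # best case ~ propagation
    p99 = samples[int(0.99 * (len(samples) - 1))]  # 99th-percentile delay
    return {
        "baseline_s": baseline,
        "p99_excess_s": p99 - baseline,            # induced queueing delay
        "worst_excess_s": samples[-1] - baseline,
        "suspect_bloat": (p99 - baseline) > threshold,
    }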