Re: [uknof] TCP Trainee
> If my laptop was filling its RWIN then sending back the ACKs it would be like one in 10

That would be a great way to kill your TCP throughput. Once an RWIN of data had been sent, the transmitter would be forced to stop sending. There would then be an RTT pause until it was able to start sending again - or longer, if the ACK was lost.
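The stall described above is the classic window-limited throughput bound: with no loss, a sender can have at most one receive window of unacknowledged data in flight per RTT. A minimal sketch of that arithmetic (illustrative numbers, not from the thread; the real ceiling is min(RWIN, CWND) and stacks vary):

```python
# Window-limited TCP throughput: at most one window of data per round trip.
# Illustrative sketch only - ignores CWND, loss, and stack-specific behaviour.

def window_limited_bps(rwin_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on throughput in bits per second for a given window/RTT."""
    return rwin_bytes * 8 / rtt_seconds

# Example: the classic 65535-byte window over a 20 ms path caps out around
# 26 Mbit/s, no matter how fast the line underneath is.
cap = window_limited_bps(65535, 0.020)
```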
Re: [uknof] TCP Trainee
On 23 June 2015 at 19:04, Neil J. McRae n...@domino.org wrote:

> Stevens is your friend.

Agreed. I've got all three kicking around. I'll have a flick through again.

Cheers,
James.
Re: [uknof] TCP Trainee
On 24 June 2015 at 10:20, David Freedman david.freed...@uk.clara.net wrote:

>> A quick Google Docs spreadsheet on-the-fly calculator for TCP throughput using RWIN size and delay is not bringing back the results I would expect, located here [3]. I want to be able to reasonably accurately calculate expected throughput, however a few tests are showing completely different results to what the spreadsheet would have me expect. When entering my RTT and RWIN sizes into the “expected RWIN” tab on the spreadsheet it tells me I should be getting a higher throughput than I actually am; the “observed RWIN” tab shows more or less what I did get when I enter my RTT and observed RWIN (I seem to be using 2x an MSS of 1460 for RWIN, which is 2920 bytes).
>
> Right, but what about CWIN and the whole host of other stuff which makes you perform differently from how you expect, because it is specific to the stack and client machine? The TCP stack of modern machines is a complicated thing, where innovation now transcends the most popular printed material.

Agreed, as I was writing that email CWIN was on my mind. I need to dig deeper into the behaviour of these two specific hosts and see if that clears things up, as it isn't obvious from the packet captures.

>> I have downloaded a 100MB test file on my colo box from another box in the US which climbed to just over 100Mbps before the end of the file was reached. I have uploaded the first 100 packets of the capture I made during this transfer also, here [2]. Looking at the packet capture from my home download the Win 7 laptop is offering an RWIN of 17520 bytes but it's sending an ACK after every two TCP segments received from the server (much less than the RWIN has been received) - why has that happened?
>
> This is the classic delayed ACK behaviour (ACK every second segment) and is defined in RFC 1122 (s.4.2.3.2). Yes, it is old, but there is a reason the client is doing this. Perhaps related to the latency.

Ah! Indeed! I have seen this before and forgotten all about it. I'll re-read and some things might become clearer. Thanks.

>> So my main focus is the spreadsheet and trying to get the calculations correct, if the hosts were acting as expected. Are the calculations just plain wrong and I'm a fucking trumpet, or is this an unreliable calculation because hosts do all sorts of crazy jazz that is OS/TCP stack dependent? Ideally if I can get the calculations correct that is what I want; host settings can be adjusted (hopefully!).
>
> As above, take the crazy jazz explanation; all you can do if you are modelling is try to predict the best and worst outcomes with the data (bw, latency, jitter etc.) you have.

That is pretty much what I am after. There are too many different operating systems and appliances in the wild and they all act very differently to each other, even different versions of network stacks on products by the same vendor, etc. So really I just want an accurate model if you *know* all the elements in play.

Cheers,
James.
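The delayed-ACK behaviour discussed above (RFC 1122 s.4.2.3.2: ACK at least every second full-sized segment) can be sketched as a toy receiver. This is illustrative only - real stacks also use a delayed-ACK timer (at most 500 ms per RFC 1122) and other heuristics:

```python
# Toy model of RFC 1122 delayed ACKs: ACK every second full-sized segment,
# with any trailing odd segment ACKed when the delayed-ACK timer fires.
# This is why a capture shows one ACK per two segments long before the
# advertised RWIN fills - it is normal, not a window problem.

def acks_for_segments(n_segments: int) -> int:
    """ACK count for a burst of n full-sized segments with ack-every-2."""
    acks = 0
    unacked = 0
    for _ in range(n_segments):
        unacked += 1
        if unacked == 2:      # second unacknowledged segment: ACK immediately
            acks += 1
            unacked = 0
    if unacked:               # trailing odd segment: ACKed on the timer
        acks += 1
    return acks
```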
Re: [uknof] TCP Trainee
On 06/25/2015 11:21 AM, Paul Mansfield wrote:

> On 25 June 2015 at 12:18, Brian Candler b.cand...@pobox.com wrote:
>> One of the current perfsonar objectives is to make a low-cost probe (I think the target is $50-$100) which can be plugged into various points in your
>
> I imagine RIPE Atlas probes cost significantly less than $50, so if I were part of that project I'd look closely at how RIPE do it.

Indeed, these are a slightly grown-up version of the boxes RIPE Atlas probes run on, will run standard OpenWRT nicely, and are less than 25 quid each: http://www.amazon.co.uk/dp/B00APZZ30W/ref=olp_product_details

Ought to be tweakable for this.

Keith
Re: [uknof] TCP Trainee
>> Alternatively you can just use the underlying tool directly, which for packet loss is owamp. By default it sends 10 packets per second, and it measures the packet loss and latency separately in each direction. (Latency measurements require good NTP sync at both ends.)
>
> Yes, configuring owamp is a pita. Thankfully GPS devices are available, but only one USB-based one I know of emits a PPS signal to condition NTP. Smokeping is good for ongoing monitoring. Flent I hope gets added to perfsonar, also.
>
> Also, I am unhappy with perfsonar as it uses (when last I looked) a totally ancient Linux kernel, which is incapable of dealing with 10+GigE well, and there have been so many post-bufferbloat-era improvements in the TCP and driver stack that I would be very reluctant to trust any TCP measurements it takes.

perfsonar provides you with the CentOS 6.5 kernel with the web100 patches (http://www.web100.org/), so yes it's pretty old, but it has had some tuning.

Someone else wrote:

> The most important limitation of perfsonar-like approaches for my use is that they require a Linux machine at the customer end of the connection. This is not really feasible with 1000+ lines.

One of the current perfsonar objectives is to make a low-cost probe (I think the target is $50-$100) which can be plugged into various points in your network and will respond to the various test types. The idea was to make it so cheap you could put one in every wiring closet. I don't know how that project is progressing, and whether it will end up being 100M or 1G capable.

In my case, I have deployed a few test endpoints which are Mac Minis running OS X, and have compiled bwctl/owamp/iperf3 on them (this is now in homebrew). So you can have a central perfsonar box which schedules tests to the Mac Minis, but the Macs can be used as normal workstations for the rest of the time.

perfsonar can also do normal ping tests, which it does in the same way as smokeping (e.g. send a burst of 20 pings every 5 minutes). It's better than nothing, but nowhere near as sensitive to low-level packet loss as owamp.

Regards,
Brian.
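The sensitivity gap between owamp's 10 packets/second and periodic ping bursts can be put into numbers: the chance of observing at least one drop in n independent probes at loss rate p is 1 - (1-p)^n. A rough sketch, assuming the 0.02% (1-in-5000) loss rate from earlier in the thread and independent, non-bursty drops (an assumption that congestion loss violates):

```python
# Probability of catching at least one drop in n independent probes at loss
# rate p. Sketch only: real loss is bursty, which makes sparse probing even
# less likely to catch a low-level fault.

def p_detect(loss_rate: float, n_probes: int) -> float:
    return 1 - (1 - loss_rate) ** n_probes

P = 0.0002                             # 0.02% loss, i.e. 1 in 5000 packets
owamp_hour = p_detect(P, 10 * 3600)    # owamp: 10 pkt/s for one hour
smokeping_hour = p_detect(P, 20 * 12)  # 20 pings every 5 min for one hour
```

At these rates owamp is all but certain to see the loss within an hour, while the ping-burst schedule catches it less than 5% of the time.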
Re: [uknof] TCP Trainee
On 25 June 2015 at 12:18, Brian Candler b.cand...@pobox.com wrote:

> One of the current perfsonar objectives is to make a low-cost probe (I think the target is $50-$100) which can be plugged into various points in your

I imagine RIPE Atlas probes cost significantly less than $50, so if I were part of that project I'd look closely at how RIPE do it.
Re: [uknof] TCP Trainee
> A quick Google Docs spreadsheet on-the-fly calculator for TCP throughput using RWIN size and delay is not bringing back the results I would expect, located here [3]. I want to be able to reasonably accurately calculate expected throughput, however a few tests are showing completely different results to what the spreadsheet would have me expect. When entering my RTT and RWIN sizes into the “expected RWIN” tab on the spreadsheet it tells me I should be getting a higher throughput than I actually am; the “observed RWIN” tab shows more or less what I did get when I enter my RTT and observed RWIN (I seem to be using 2x an MSS of 1460 for RWIN, which is 2920 bytes).

Right, but what about CWIN and the whole host of other stuff which makes you perform differently from how you expect, because it is specific to the stack and client machine? The TCP stack of modern machines is a complicated thing, where innovation now transcends the most popular printed material.

> I have downloaded a 100MB test file on my colo box from another box in the US which climbed to just over 100Mbps before the end of the file was reached. I have uploaded the first 100 packets of the capture I made during this transfer also, here [2]. Looking at the packet capture from my home download the Win 7 laptop is offering an RWIN of 17520 bytes but it's sending an ACK after every two TCP segments received from the server (much less than the RWIN has been received) - why has that happened?

This is the classic delayed ACK behaviour (ACK every second segment) and is defined in RFC 1122 (s.4.2.3.2). Yes, it is old, but there is a reason the client is doing this. Perhaps related to the latency.

> So my main focus is the spreadsheet and trying to get the calculations correct, if the hosts were acting as expected. Are the calculations just plain wrong and I'm a fucking trumpet, or is this an unreliable calculation because hosts do all sorts of crazy jazz that is OS/TCP stack dependent? Ideally if I can get the calculations correct that is what I want; host settings can be adjusted (hopefully!).

As above, take the crazy jazz explanation; all you can do if you are modelling is try to predict the best and worst outcomes with the data (bw, latency, jitter etc.) you have.

Dave.
Re: [uknof] TCP Trainee
> I have downloaded a 100MB test file on my colo box from another box in the US which climbed to just over 100Mbps before the end of the file was reached.

What's the problem - what speed were you expecting from the spreadsheet?

There are a whole bunch of things which can affect the total throughput of TCP, including:

* Packet loss - https://en.wikipedia.org/wiki/TCP_tuning#Packet_loss
* What speed the remote box is able to send at (may be limited by CPU, bandwidth of the remote connection, etc.)

Here's a real-world example. I was on an office 100Mbps fibre line. When downloading a file from Greece (75ms RTT away) the throughput was limited to about 3Mbps. But my home FTTC line, with no packet loss, was maxing out the line at 67Mbps. It turned out there was packet loss of 0.02% on the office line (i.e. only 1 in 5000 packets dropped), which I was also able to demonstrate directly using ping and owamp. The provider managed to fix the packet loss by moving the exchange end to a different switch port, and now the same download fills the whole 100Mbps.

Connecting to a local server using speedtest.net didn't demonstrate the problem - it showed ~75Mbps. This is for two reasons: speedtest.net chooses a nearby server with a low RTT, and the client opens four concurrent TCP streams.

Regards,
Brian.
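One way to sanity-check the Greece example is the Mathis et al. approximation for loss-limited, Reno-style TCP: rate ≈ MSS / (RTT × √p). It is only a rough upper bound, and modern stacks (CUBIC, BBR) deviate from it, but it shows why 0.02% loss at 75 ms caps throughput far below line rate:

```python
import math

# Mathis et al. approximation for loss-limited (Reno-style) TCP throughput:
#   rate <= MSS / (RTT * sqrt(p))
# A rough upper bound only; congestion-control variants behave differently.

def mathis_bps(mss_bytes: float, rtt_s: float, loss: float) -> float:
    return mss_bytes * 8 / (rtt_s * math.sqrt(loss))

# The Greece numbers: 1460-byte MSS, 75 ms RTT, 0.02% loss gives a ceiling
# of roughly 11 Mbit/s - the same order as the ~3 Mbit/s actually observed,
# and far below the 100 Mbit/s line rate.
ceiling = mathis_bps(1460, 0.075, 0.0002)
```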
Re: [uknof] TCP Trainee
Brian Candler writes:

> It turned out there was packet loss of 0.02% on the office line (i.e. only 1 in 5000 packets dropped), which I was also able to demonstrate directly using ping and owamp.

TCP is ridiculously sensitive to packet loss at the speeds we see today. Lines cannot be allowed any measurable packet loss except when there is congestion. This does not combine well with e.g. 4G: when moving a mobile device around, it is easy to have 0.1% packet loss or more, which means the user will never see the 100Mbps+ speeds that 4G could theoretically offer. Application developers are likely to make workarounds like opening multiple TCP streams when fetching video, or even abandoning TCP altogether.

> The provider managed to fix the packet loss by moving the exchange end to a different switch port, and now the same download fills the whole 100Mbps.

I am impressed that the provider took a 0.02% packet loss seriously. Many providers would not.

If you know of a way to measure 0.02% packet loss automatically with an NMS, I would be happy to hear about it. The typical five or so ping packets used in Zabbix and similar are unlikely to show it, and a sufficiently low loss threshold would mean drowning in congestion-caused false alarms.

/Benny
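To put numbers on why a handful of pings cannot see 0.02% loss: assuming independent drops (real loss is bursty, so this is optimistic), seeing at least one drop with a given confidence needs n ≥ ln(1 − confidence) / ln(1 − p) probes. A sketch:

```python
import math

# Probes required to observe at least one drop with the given confidence,
# assuming independent drops at rate loss_rate. Bursty congestion loss
# makes the real requirement even worse for sparse probing.

def pings_needed(loss_rate: float, confidence: float) -> int:
    return math.ceil(math.log(1 - confidence) / math.log(1 - loss_rate))

# At 0.02% loss, ~15,000 probes are needed for 95% confidence of seeing
# even a single drop - far beyond a 5-ping Zabbix check.
n = pings_needed(0.0002, 0.95)
```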
Re: [uknof] TCP Trainee
Paul Mansfield writes:

> if you have Cisco at the CPE, can you use CISCO-PING-MIB? other router vendors must have something similar?

I can ping directly from the NMS already. Unfortunately that does not help me distinguish between the typical congestion-caused packet drops (the customer actually using the line they paid for) and the drops caused by faults. Making the ping packets high priority can help, but ultimately, even if you send one ping packet every second, you will see less than one drop every hour on a line with 0.02% packet loss. Is detection of such low drop rates possible with active probes, without the probes impacting user traffic?

In a perfect world I would be able to assign each unicast packet a number when it leaves the PE and check that every packet arrives at the CPE. The CPE would need a small reorder buffer, but other than that it should be feasible. But where to stick the counter? An extra MPLS label or an extra VLAN tag would be tempting options; most of the L2 providers should be transparent to that. However, they might try to hash on it for ECMP, which would be rather unfortunate.

/Benny
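The one-probe-per-second arithmetic behind "less than one drop every hour": expected drops per hour is just loss rate × probes per hour, which at 0.02% comes to well under one. A trivial sketch:

```python
# Expected drops per hour for an active probe at a given loss rate.
# At 0.02% loss and 1 probe/s this is ~0.72 drops/hour - easily lost in
# the noise of ordinary congestion-caused drops.

def expected_drops_per_hour(loss_rate: float, probes_per_second: float = 1.0) -> float:
    return loss_rate * probes_per_second * 3600
```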
Re: [uknof] TCP Trainee
On Wed, Jun 24, 2015 at 3:01 AM, Brian Candler b.cand...@pobox.com wrote:

> On 24/06/2015 10:44, Benny Amorsen wrote:
>> Brian Candler writes:
>>> It turned out there was packet loss of 0.02% on the office line (i.e. only 1 in 5000 packets dropped), which I was also able to demonstrate directly using ping and owamp.
>>
>> TCP is ridiculously sensitive to packet loss at the speeds we see today. Lines are not permitted to have any measurable packet loss except when there is congestion. This does not combine well with e.g. 4G. When moving a mobile device around, it is easy to have 0.1% packet loss or more, which means the user will never see the 100Mbps+ speeds that 4G could theoretically offer. Application developers are likely to make work-arounds like opening multiple TCP streams when fetching video or even abandoning TCP altogether.
>>
>>> The provider managed to fix the packet loss by moving the exchange end to a different switch port, and now the same download fills the whole 100Mbps.
>>
>> I am impressed that the provider took a 0.02% packet loss seriously. Many providers would not.
>
> It took several months, lots of measurements collected at our side, and threats of moving to a different provider.
>
>> If you know of a way to measure 0.02% packet loss automatically with an NMS, I would be happy to hear about it.
>
> perfsonar PS toolkit. As well as very sensitive packet loss measurements, it can also do periodic burst transfers using iperf or similar tools to see how much throughput you can actually achieve. The stack you get with perfsonar is:
>
> - CentOS
> - owamp, iperf/iperf3/nuttcp [low-level testing tools]
> - bwctl [allows remote sites to request tests, ensures no two throughput tests at the same time]
> - esmond [database for storing the results]
> - test scheduler
> - GUI to configure test schedules and draw graphs
>
> esmond can be queried via a REST interface, so in principle it shouldn't be too hard to integrate with an NMS.
>
> Alternatively you can just use the underlying tool directly, which for packet loss is owamp. By default it sends 10 packets per second, and it measures the packet loss and latency separately in each direction. (Latency measurements require good NTP sync at both ends.)
>
> Regards,
> Brian.

Yes, configuring owamp is a pita. Thankfully GPS devices are available, but only one USB-based one I know of emits a PPS signal to condition NTP. Smokeping is good for ongoing monitoring. Flent I hope gets added to perfsonar, also.

Also, I am unhappy with perfsonar as it uses (when last I looked) a totally ancient Linux kernel, which is incapable of dealing with 10+GigE well, and there have been so many post-bufferbloat-era improvements in the TCP and driver stack that I would be very reluctant to trust any TCP measurements it takes.

--
Dave Täht
worldwide bufferbloat report: http://www.dslreports.com/speedtest/results/bufferbloat
And: What will it take to vastly improve wifi for everyone? https://plus.google.com/u/0/explore/makewififast
Re: [uknof] TCP Trainee
If you have Cisco at the CPE, can you use CISCO-PING-MIB? Other router vendors must have something similar?
[uknof] TCP Trainee
Hi All,

A quick Google Docs spreadsheet on-the-fly calculator for TCP throughput using RWIN size and delay is not bringing back the results I would expect, located here [3]. I want to be able to reasonably accurately calculate expected throughput, however a few tests are showing completely different results to what the spreadsheet would have me expect. When entering my RTT and RWIN sizes into the “expected RWIN” tab on the spreadsheet it tells me I should be getting a higher throughput than I actually am; the “observed RWIN” tab shows more or less what I did get when I enter my RTT and observed RWIN (I seem to be using 2x an MSS of 1460 for RWIN, which is 2920 bytes).

I have downloaded a 25MB file from a colo box over my home ADSL whilst running a capture in Wireshark. From home I get a consistent 18ms RTT to the colo box. I have uploaded the first 100 packets of that capture here [1]. My connection quickly shot up to its max speed, which is about 825KBps. No problems with the connection itself.

I have downloaded a 100MB test file on my colo box from another box in the US which climbed to just over 100Mbps before the end of the file was reached. I have uploaded the first 100 packets of the capture I made during this transfer also, here [2].

Looking at the packet capture from my home download, the Win 7 laptop is offering an RWIN of 17520 bytes but it's sending an ACK after every two TCP segments received from the server (much less than the RWIN has been received) - why has that happened? My colo box runs Linux (3.2.46-1) and presumably so does the US server, as the HTTP header shows it's also running nginx. Between those two boxes the 2nd packet shows the same result: the client is ACKing back long before the RWIN is full - why is this happening?

So my main focus is the spreadsheet and trying to get the calculations correct, if the hosts were acting as expected. Are the calculations just plain wrong and I'm a fucking trumpet, or is this an unreliable calculation because hosts do all sorts of crazy jazz that is OS/TCP stack dependent? Ideally if I can get the calculations correct that is what I want; host settings can be adjusted (hopefully!).

Cheers,
James.

[1] https://www.cloudshark.org/captures/be31c1aa5ce1
[2] https://www.cloudshark.org/captures/5c1d6eba6b7f
[3] https://docs.google.com/spreadsheets/d/1XcIEip53VwIZoXtq6Lwc95UcWoZkqepGackFtLMZj-4/edit?usp=sharing
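For what it's worth, the numbers quoted in the message above suggest the advertised window is not the limit on the home line: 17520 bytes over an 18 ms RTT allows roughly 7.8 Mbit/s, above the ~6.6 Mbit/s (825 KBps) the ADSL line tops out at. A quick sketch of that arithmetic (numbers taken from the email, not from the spreadsheet):

```python
# Window-limited throughput check using the home-ADSL capture numbers:
# an offered RWIN of 17520 bytes over an 18 ms RTT allows ~7.8 Mbit/s,
# so with the line maxing out at ~6.6 Mbit/s (825 KBps), the bottleneck
# is the line rate, not the receive window.

def window_cap_bps(rwin_bytes: float, rtt_seconds: float) -> float:
    return rwin_bytes * 8 / rtt_seconds

rwin_cap = window_cap_bps(17520, 0.018)   # window-allowed ceiling, bit/s
line_rate = 825_000 * 8                   # ~825 KBps ADSL downstream, bit/s
window_is_bottleneck = rwin_cap < line_rate
```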
Re: [uknof] TCP Trainee
Stevens is your friend.

Neil

Sent from my iPhone

On 23 Jun 2015, at 17:04, James Bensley jwbens...@gmail.com wrote:

> Hi All, A quick Google Docs spreadsheet on-the-fly calculator for TCP throughput using RWIN size and delay is not bringing back the results I would expect, located here [3]. [...]
>
> [1] https://www.cloudshark.org/captures/be31c1aa5ce1
> [2] https://www.cloudshark.org/captures/5c1d6eba6b7f
> [3] https://docs.google.com/spreadsheets/d/1XcIEip53VwIZoXtq6Lwc95UcWoZkqepGackFtLMZj-4/edit?usp=sharing