Re: [uknof] TCP Trainee

2015-06-28 Thread Brian Candler

 If my laptop was filling its RWIN
 then sending back the ACKs it would be like one in 10
That would be a great way to kill your TCP throughput. Once an RWIN of 
data had been sent, the transmitter would be forced to stop sending. 
There would then be an RTT pause until it was able to start sending 
again - or longer, if the ACK was lost.
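To put numbers on that stall, here is a rough back-of-the-envelope
sketch using the figures from the home capture discussed below in the
thread (17520-byte RWIN, 18ms RTT, ~825KB/s line); the single-stall-per-
window model is a simplification:

    rwin = 17520            # bytes, the laptop's advertised window
    rtt = 0.018             # seconds, home ADSL to the colo box
    line_rate = 6.6e6       # bits/s, ~825KB/s

    tx_time = rwin * 8 / line_rate          # time to clock one window out
    stalled = rwin * 8 / (tx_time + rtt)    # one RTT idle per window
    print(f"{stalled / 1e6:.2f} Mbit/s vs {line_rate / 1e6:.2f} Mbit/s line rate")
    # -> roughly 3.6 Mbit/s: the stall alone costs about half the line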





Re: [uknof] TCP Trainee

2015-06-27 Thread James Bensley
On 23 June 2015 at 19:04, Neil J. McRae n...@domino.org wrote:
 Stevens is your friend.

Agreed. I've got all three kicking around; I'll have a flick through again.

Cheers,
James.



Re: [uknof] TCP Trainee

2015-06-27 Thread James Bensley
On 24 June 2015 at 10:20, David Freedman david.freed...@uk.clara.net wrote:

A quick Google Docs spreadsheet, an on-the-fly calculator for TCP
throughput using RWIN size and delay, is not bringing back the results
I would expect; it is located here [3]. I want to be able to calculate
expected throughput reasonably accurately, however a few tests are
showing completely different results to what the spreadsheet would
have me expect.

When entering my RTT and RWIN sizes into the “expected RWIN” tab on
the spreadsheet, it tells me I should be getting a higher throughput
than I actually am. The “observed RWIN” tab shows more or less what I
did get when I enter my RTT and observed RWIN (I seem to be using 2x
an MSS of 1460 for RWIN, which is 2920 bytes).


 Right, but what about CWIN and the whole host of other stuff which makes
 you perform differently from how you expect because it is specific to the
 stack and client machine?

 The TCP stack of modern machines is a complicated thing, where innovation
 now transcends the most popular printed material.

Agreed, as I was writing that email CWIN was on my mind. I need to dig
deeper into the behaviour of these two specific hosts and see if that
clears things up, as it isn't obvious from the packet captures.
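For what it's worth, here is a toy slow-start model I am using to think
about it (a sketch only: the 2x-MSS initial window is an assumption
about the stacks involved, and real CWND growth is far messier):

    rtt = 0.018          # seconds, home ADSL RTT from the capture
    rwin = 17520         # bytes, the laptop's advertised window
    mss = 1460           # bytes

    cwnd = 2 * mss       # assumed initial congestion window
    for n in range(1, 8):
        window = min(cwnd, rwin)   # the sender is held to both limits
        print(f"RTT {n}: window {window} B, "
              f"ceiling {window * 8 / rtt / 1e6:.2f} Mbit/s")
        cwnd *= 2        # classic slow start: double per RTT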



I have downloaded a 100MB test file on my colo box from another box in
the US which climbed to just over 100Mbps before the end of the file
was reached. I have uploaded the first 100 packets of the capture I
made during this transfer also, here [2].

Looking at the packet capture from my home download, the Win 7 laptop
is offering an RWIN of 17520 bytes but it’s sending an ACK after every
two TCP segments received from the server (far less than a full RWIN of
data has been received) – why has that happened?

 This is the classic delayed-ACK behaviour (ACK every second segment),
 defined in RFC 1122 (s.4.2.3.2). Yes, it is old, but there is
 a reason the client is doing this; perhaps it is related to the latency.

Ah! Indeed! I have seen this before and forgotten all about it. I'll
re-read and some things might become clearer. Thanks.
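If anyone else is eyeballing the captures [1][2], a Wireshark display
filter that keeps only the pure ACKs (no payload) makes the
every-second-segment pattern easy to see. Standard filter syntax, though
field names can shift between Wireshark versions:

    tcp.flags.ack == 1 && tcp.len == 0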

So my main focus is the spreadsheet and trying to get the calculations
correct, assuming the hosts were acting as expected. Are the calculations
just plain wrong and I’m a fucking trumpet, or is this an unreliable
calculation because hosts do all sorts of crazy jazz that is OS/TCP
stack dependent? Ideally I want to get the calculations correct; host
settings can be adjusted (hopefully!).


 As above, take the crazy jazz explanation; all you can do if you
 are modelling is try to predict the best and worst outcomes with the
 data (bandwidth, latency, jitter, etc.) you have.


That is pretty much what I am after. There are too many different
operating systems and appliances in the wild and they all act very
differently to each other, even different versions of the network stack
on products from the same vendor, etc. So really I just want an
accurate model for the case where you *know* all the elements in play.
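Something like the sketch below is probably the best such a model can
do: bound the rate by the slowest of the line rate, the window limit
(RWIN * 8 / RTT) and, when there is loss, the Mathis et al.
approximation MSS / (RTT * sqrt(p)). The colo-to-US numbers (90ms RTT,
a 1 MiB scaled window) are assumptions for illustration, not
measurements from the captures:

    from math import sqrt

    def tcp_ceiling(bw_bps, rwin_bytes, mss_bytes, rtt_s, loss):
        # Achievable rate is capped by every limit at once; report the lowest.
        bounds = [bw_bps, rwin_bytes * 8 / rtt_s]
        if loss > 0:
            bounds.append(mss_bytes * 8 / (rtt_s * sqrt(loss)))
        return min(bounds)

    for p in (0.0, 0.0001, 0.001):
        rate = tcp_ceiling(1e9, 2**20, 1460, 0.09, p)
        print(f"loss {p:.4f}: <= {rate / 1e6:.1f} Mbit/s")
    # -> ~93 Mbit/s with no loss (window-limited),
    #    ~13 and ~4 Mbit/s once loss enters the picture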

Cheers,

James.



Re: [uknof] TCP Trainee

2015-06-26 Thread Keith Mitchell
On 06/25/2015 11:21 AM, Paul Mansfield wrote:
 On 25 June 2015 at 12:18, Brian Candler b.cand...@pobox.com wrote:
 One of the current perfsonar objectives is to make a low-cost probe (I think
 the target is $50-$100) which can be plugged into various points in your
 
 I imagine RIPE Atlas probes cost significantly less than $50, so if I
 were part of that project I'd look closely at how RIPE do it.

Indeed, these are a slightly grown-up version of the boxes RIPE Atlas
probes run on, will run standard OpenWrt nicely, and are less than 25
quid each:

http://www.amazon.co.uk/dp/B00APZZ30W/ref=olp_product_details

Ought to be tweakable for this.

Keith




Re: [uknof] TCP Trainee

2015-06-25 Thread Brian Candler

 Alternatively you can just use the underlying tool directly, which for
 packet loss is owamp. By default it sends 10 packets per second, and it
 measures the packet loss and latency separately in each direction. (Latency
 measurements require good NTP sync at both ends).

 Yes, configuring owamp is a PITA. Thankfully GPS devices are available, but
 only one USB-based one I know of emits a PPS signal to condition NTP.

 Smokeping is good for ongoing monitoring. I hope Flent gets added to
 perfsonar, too.

 Also, I am unhappy with perfsonar as it uses (when last I looked) a
 totally ancient Linux kernel, which is incapable of dealing with
 10+GigE well, and there have been so many post-bufferbloat-era
 improvements in the TCP and driver stack that I would be very
 reluctant to trust any TCP measurements it takes.

perfsonar provides you with the CentOS 6.5 kernel with the web100 
patches (http://www.web100.org/), so yes it's pretty old, but it has had 
some tuning.


Someone else wrote:


The most important limitation of perfsonar-like approaches for my use is
that they require a Linux machine at the customer end of the connection.
This is not really feasible with 1000+ lines.


One of the current perfsonar objectives is to make a low-cost probe (I 
think the target is $50-$100) which can be plugged into various points 
in your network and will respond to the various test types. The idea was 
to make it so cheap you could put one in every wiring closet.  I don't 
know how that project is progressing, and whether it will end up being 
100M or 1G capable.


In my case, I have deployed a few test endpoints which are Mac Minis 
running OSX, and have compiled bwctl/owamp/iperf3 on them (this is now 
in homebrew).  So you can have a central perfsonar box which schedules 
tests to the Mac Minis, but the Macs can be used as normal workstations 
for the rest of the time.
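For anyone replicating this, the installs are one-liners (formula names
as of the time of writing; check brew search if they have moved):

    brew install iperf3
    brew install owamp bwctl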


perfsonar can also do normal ping tests, which it does in the same way 
as smokeping (e.g. send a burst of 20 pings every 5 minutes). It's 
better than nothing, but nowhere near as sensitive to low-level packet 
loss as owamp.


Regards,

Brian.




Re: [uknof] TCP Trainee

2015-06-25 Thread Paul Mansfield
On 25 June 2015 at 12:18, Brian Candler b.cand...@pobox.com wrote:
 One of the current perfsonar objectives is to make a low-cost probe (I think
 the target is $50-$100) which can be plugged into various points in your

I imagine RIPE Atlas probes cost significantly less than $50, so if I
were part of that project I'd look closely at how RIPE do it.



Re: [uknof] TCP Trainee

2015-06-24 Thread David Freedman

 A quick Google Docs spreadsheet, an on-the-fly calculator for TCP
 throughput using RWIN size and delay, is not bringing back the results
 I would expect; it is located here [3]. I want to be able to calculate
 expected throughput reasonably accurately, however a few tests are
 showing completely different results to what the spreadsheet would
 have me expect.

 When entering my RTT and RWIN sizes into the “expected RWIN” tab on
 the spreadsheet, it tells me I should be getting a higher throughput
 than I actually am. The “observed RWIN” tab shows more or less what I
 did get when I enter my RTT and observed RWIN (I seem to be using 2x
 an MSS of 1460 for RWIN, which is 2920 bytes).


Right, but what about CWIN and the whole host of other stuff which makes
you perform differently from how you expect because it is specific to the
stack and client machine?

The TCP stack of modern machines is a complicated thing, where innovation
now transcends the most popular printed material.



 I have downloaded a 100MB test file on my colo box from another box in
 the US which climbed to just over 100Mbps before the end of the file
 was reached. I have uploaded the first 100 packets of the capture I
 made during this transfer also, here [2].

 Looking at the packet capture from my home download, the Win 7 laptop
 is offering an RWIN of 17520 bytes but it’s sending an ACK after every
 two TCP segments received from the server (far less than a full RWIN of
 data has been received) – why has that happened?

This is the classic delayed-ACK behaviour (ACK every second segment),
defined in RFC 1122 (s.4.2.3.2). Yes, it is old, but there is
a reason the client is doing this; perhaps it is related to the latency.



 So my main focus is the spreadsheet and trying to get the calculations
 correct, assuming the hosts were acting as expected. Are the calculations
 just plain wrong and I’m a fucking trumpet, or is this an unreliable
 calculation because hosts do all sorts of crazy jazz that is OS/TCP
 stack dependent? Ideally I want to get the calculations correct; host
 settings can be adjusted (hopefully!).


As above, take the crazy jazz explanation; all you can do if you
are modelling is try to predict the best and worst outcomes with the
data (bandwidth, latency, jitter, etc.) you have.

Dave. 


Re: [uknof] TCP Trainee

2015-06-24 Thread Brian Candler

 I have downloaded a 100MB test file on my colo box from another box in
 the US which climbed to just over 100Mbps before the end of the file
 was reached.


What's the problem - what speed were you expecting from the spreadsheet?

There are a whole bunch of things which can affect the total throughput of TCP, 
including:

* Packet loss - https://en.wikipedia.org/wiki/TCP_tuning#Packet_loss
* What speed the remote box is able to send at (may be limited by CPU, 
bandwidth of remote connection etc)

Here's a real-world example. I was on an office 100Mbps fibre line. When 
downloading a file from Greece (75ms RTT away) the throughput was limited to 
about 3Mbps.  But my home FTTC line, with no packet loss, was maxing out the 
line at 67Mbps.

It turned out there was packet loss of 0.02% on the office line (i.e. only 1 in 
5000 packets dropped), which I was also able to demonstrate directly using ping 
and owamp.

The provider managed to fix the packet loss by moving the exchange end to a 
different switch port, and now the same download fills the whole 100Mbps.
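As a sanity check, the Mathis et al. approximation (rate <=
MSS / (RTT * sqrt(p)), with a constant factor of roughly 1) puts a
ceiling on the Greece case; a minimal sketch:

    from math import sqrt

    mss, rtt, p = 1460, 0.075, 0.0002   # bytes, seconds, 1-in-5000 loss
    print(f"<= {mss * 8 / (rtt * sqrt(p)) / 1e6:.1f} Mbit/s")
    # -> ~11 Mbit/s: the same order as the ~3 Mbit/s observed,
    #    and nowhere near the 100 Mbit/s line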

Connecting to a local server using speedtest.net didn't demonstrate the problem 
- it showed ~75Mbps. This is for two reasons: speedtest.net chooses a nearby 
server with a low RTT, and the client opens four concurrent TCP streams.

Regards,

Brian.




Re: [uknof] TCP Trainee

2015-06-24 Thread Benny Amorsen


Brian Candler writes:

 It turned out there was packet loss of 0.02% on the office line (i.e.
 only 1 in 5000 packets dropped), which I was also able to demonstrate
 directly using ping and owamp.

TCP is ridiculously sensitive to packet loss at the speeds we see today.
For TCP to reach those speeds, lines effectively cannot have any
measurable packet loss except when there is congestion.

This does not combine well with e.g. 4G. When moving a mobile device
around, it is easy to have 0.1% packet loss or more, which means the
user will never see the 100Mbps+ speeds that 4G could theoretically
offer.

Application developers are likely to make work-arounds like opening
multiple TCP streams when fetching video or even abandoning TCP
altogether.

 The provider managed to fix the packet loss by moving the exchange end
 to a different switch port, and now the same download fills the whole
 100Mbps.

I am impressed that the provider took a 0.02% packet loss seriously.
Many providers would not.

If you know of a way to measure 0.02% packet loss automatically with an
NMS, I would be happy to hear about it. The typical spread of maybe 5
ping packets used in Zabbix and similar tools is unlikely to show it. A
sufficiently low loss threshold would mean drowning in congestion-caused
false alarms.
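To put a number on how unlikely: the chance that a burst of N probes
catches a loss rate p at least once is 1 - (1 - p)^N, so:

    p, n = 0.0002, 5
    print(1 - (1 - p) ** n)   # ~0.001: only ~0.1% of 5-ping polls see any loss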


/Benny




Re: [uknof] TCP Trainee

2015-06-24 Thread Benny Amorsen
Paul Mansfield writes:

 if you have Cisco at the CPE, can you use CISCO-PING-MIB?

 other router vendors must have something similar?

I can ping directly from the NMS already. Unfortunately that does not
help me distinguish between the typical congestion-caused packet drops
(customer actually uses the line they paid for) and the drops caused by
faults.

Making the ping packets high priority can help, but ultimately even if
you send one ping packet every second, you will see less than one drop
every hour on a line with 0.02% packet loss. Is detection of such low
drop rates possible with active probes without the probes impacting user
traffic?
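The arithmetic, for anyone checking:

    probes_per_hour = 3600         # one ping per second
    loss = 0.0002                  # 0.02%
    print(probes_per_hour * loss)  # -> 0.72 expected drops per hour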

In a perfect world I would be able to assign each unicast packet a
number when it leaves the PE and check that every packet arrives at the
CPE. The CPE would need a small reorder buffer, but other than that it
should be feasible.

But where to stick the counter? An extra MPLS label or an extra VLAN tag
would be tempting options; most of the L2 providers should be transparent
to that. However they might try to hash on it for ECMP, which would be
rather unfortunate.
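As a toy model of the CPE-side bookkeeping (purely illustrative: the
class, window size and counting rule are made up, and no MPLS/VLAN
encoding is attempted):

    class LossCounter:
        def __init__(self, reorder_window=64):
            self.window = reorder_window
            self.highest = -1      # highest sequence number seen so far
            self.pending = set()   # numbers below 'highest' not yet seen
            self.lost = 0

        def receive(self, seq):
            if seq > self.highest:
                # everything between the old highest and this packet is pending
                self.pending.update(range(self.highest + 1, seq))
                self.highest = seq
            else:
                self.pending.discard(seq)   # late arrival inside the window
            # anything too far behind the highest sequence is declared lost
            expired = {s for s in self.pending if s < self.highest - self.window}
            self.lost += len(expired)
            self.pending -= expired

    c = LossCounter()
    for seq in (s for s in range(10000) if s != 5000):  # one drop in 10,000
        c.receive(seq)
    print(c.lost)   # -> 1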


/Benny




Re: [uknof] TCP Trainee

2015-06-24 Thread Dave Taht
On Wed, Jun 24, 2015 at 3:01 AM, Brian Candler b.cand...@pobox.com wrote:
 On 24/06/2015 10:44, Benny Amorsen wrote:


 Brian Candler writes:

 It turned out there was packet loss of 0.02% on the office line (i.e.
 only 1 in 5000 packets dropped), which I was also able to demonstrate
 directly using ping and owamp.

 TCP is ridiculously sensitive to packet loss at the speeds we see today.
 Lines are not permitted to have any measurable packet loss except when
 there is congestion.

 This does not combine well with e.g. 4G. When moving a mobile device
 around, it is easy to have 0.1% packet loss or more, which means the
 user will never see the 100Mbps+ speeds that 4G could theoretically
 offer.

 Application developers are likely to make work-arounds like opening
 multiple TCP streams when fetching video or even abandoning TCP
 altogether.

 The provider managed to fix the packet loss by moving the exchange end
 to a different switch port, and now the same download fills the whole
 100Mbps.

 I am impressed that the provider took a 0.02% packet loss seriously.
 Many providers would not.

 It took several months, lots of measurements collected at our side, and
 threats of moving to a different provider.

 If you know of a way to measure 0.02% packet loss automatically with an
 NMS, I would be happy to hear about it.

 perfsonar PS toolkit. As well as very sensitive packet loss measurements, it
 can also do periodic burst transfers using iperf or similar tools to see how
 much throughput you can actually achieve.

 The stack you get with perfsonar is:

 - CentOS
 - owamp, iperf/iperf3/nuttcp [low-level testing tools]
 - bwctl [allows remote sites to request tests, ensures no two throughput
 tests at the same time]
 - esmond [database for storing the results]
 - test scheduler
 - GUI to configure test schedules and draw graphs

 esmond can be queried via a REST interface, so in principle it shouldn't be
 too hard to integrate with an NMS.

 Alternatively you can just use the underlying tool directly, which for
 packet loss is owamp. By default it sends 10 packets per second, and it
 measures the packet loss and latency separately in each direction. (Latency
 measurements require good NTP sync at both ends).

Yes, configuring owamp is a PITA. Thankfully GPS devices are available, but
only one USB-based one I know of emits a PPS signal to condition NTP.
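For reference, a typical client-side invocation (flags per the owping
manpage as I recall it, so double-check against your version; the host
name is a placeholder):

    owping -c 600 -i 0.1 perfsonar.example.net

That asks for 600 test packets at roughly 10 per second in each
direction, and reports loss and one-way delay separately per direction.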

Smokeping is good for ongoing monitoring. I hope Flent gets added to
perfsonar, too.

Also, I am unhappy with perfsonar as it uses (when last I looked) a
totally ancient Linux kernel, which is incapable of dealing with
10+GigE well, and there have been so many post-bufferbloat-era
improvements in the TCP and driver stack that I would be very
reluctant to trust any TCP measurements it takes.
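On the esmond point quoted above: a rough sketch of pulling results back
out over REST. The /esmond/perfsonar/archive/ path and the event-type
parameter are from the esmond docs as I remember them, so treat both as
assumptions and check your install:

    import requests

    base = "http://ps-host.example/esmond/perfsonar/archive/"
    r = requests.get(base, params={"event-type": "packet-loss-rate"})
    for entry in r.json():
        # each entry describes one measurement set between a host pair
        print(entry.get("source"), entry.get("destination"), entry.get("url"))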

 Regards,

 Brian.





-- 
Dave Täht
worldwide bufferbloat report:
http://www.dslreports.com/speedtest/results/bufferbloat
And:
What will it take to vastly improve wifi for everyone?
https://plus.google.com/u/0/explore/makewififast



Re: [uknof] TCP Trainee

2015-06-24 Thread Paul Mansfield
if you have Cisco at the CPE, can you use CISCO-PING-MIB?

other router vendors must have something similar?



[uknof] TCP Trainee

2015-06-23 Thread James Bensley
Hi All,

A quick Google Docs spreadsheet, an on-the-fly calculator for TCP
throughput using RWIN size and delay, is not bringing back the results
I would expect; it is located here [3]. I want to be able to calculate
expected throughput reasonably accurately, however a few tests are
showing completely different results to what the spreadsheet would
have me expect.

When entering my RTT and RWIN sizes into the “expected RWIN” tab on
the spreadsheet, it tells me I should be getting a higher throughput
than I actually am. The “observed RWIN” tab shows more or less what I
did get when I enter my RTT and observed RWIN (I seem to be using 2x
an MSS of 1460 for RWIN, which is 2920 bytes).
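For reference, the basic bound I am trying to encode is simply
throughput <= RWIN * 8 / RTT; a quick sketch with my numbers:

    rtt = 0.018                           # seconds, home ADSL to colo
    for rwin in (2920, 17520, 65535):     # bytes: 2x MSS, the laptop's offer,
        print(f"RWIN {rwin}: <= {rwin * 8 / rtt / 1e6:.2f} Mbit/s")  # max unscaled
    # -> ~1.3, ~7.8 and ~29.1 Mbit/s respectively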

I have downloaded a 25MB file from a colo box over my home ADSL whilst
running a capture in Wireshark. From home I get a consistent 18ms RTT
to the colo box. I have uploaded the first 100 packets of that capture
here [1]. My connection quickly shot up to its max speed, which is
about 825KB/s. No problems with the connection itself.

I have downloaded a 100MB test file on my colo box from another box in
the US which climbed to just over 100Mbps before the end of the file
was reached. I have uploaded the first 100 packets of the capture I
made during this transfer also, here [2].

Looking at the packet capture from my home download, the Win 7 laptop
is offering an RWIN of 17520 bytes but it’s sending an ACK after every
two TCP segments received from the server (far less than a full RWIN of
data has been received) – why has that happened?

My colo box runs Linux (3.2.46-1) and presumably so does the US server,
as the HTTP header shows it’s also running nginx. Between those two
boxes the 2nd packet shows the same behaviour: the client is ACKing
long before the RWIN is full – why is this happening?

So my main focus is the spreadsheet and trying to get the calculations
correct, assuming the hosts were acting as expected. Are the calculations
just plain wrong and I’m a fucking trumpet, or is this an unreliable
calculation because hosts do all sorts of crazy jazz that is OS/TCP
stack dependent? Ideally I want to get the calculations correct; host
settings can be adjusted (hopefully!).

Cheers,
James.

[1] https://www.cloudshark.org/captures/be31c1aa5ce1
[2] https://www.cloudshark.org/captures/5c1d6eba6b7f
[3] 
https://docs.google.com/spreadsheets/d/1XcIEip53VwIZoXtq6Lwc95UcWoZkqepGackFtLMZj-4/edit?usp=sharing



Re: [uknof] TCP Trainee

2015-06-23 Thread Neil J. McRae
Stevens is your friend.

Neil 

Sent from my iPhone

 On 23 Jun 2015, at 17:04, James Bensley jwbens...@gmail.com wrote:
 
 Hi All,
 
 A quick Google Docs spreadsheet, an on-the-fly calculator for TCP
 throughput using RWIN size and delay, is not bringing back the results
 I would expect; it is located here [3]. I want to be able to calculate
 expected throughput reasonably accurately, however a few tests are
 showing completely different results to what the spreadsheet would
 have me expect.
 
 When entering my RTT and RWIN sizes into the “expected RWIN” tab on
 the spreadsheet, it tells me I should be getting a higher throughput
 than I actually am. The “observed RWIN” tab shows more or less what I
 did get when I enter my RTT and observed RWIN (I seem to be using 2x
 an MSS of 1460 for RWIN, which is 2920 bytes).
 
 I have downloaded a 25MB file from a colo box over my home ADSL whilst
 running a capture in Wireshark. From home I get a consistent 18ms RTT
 to the colo box. I have uploaded the first 100 packets of that capture
 here [1]. My connection quickly shot up to its max speed, which is
 about 825KB/s. No problems with the connection itself.
 
 I have downloaded a 100MB test file on my colo box from another box in
 the US which climbed to just over 100Mbps before the end of the file
 was reached. I have uploaded the first 100 packets of the capture I
 made during this transfer also, here [2].
 
 Looking at the packet capture from my home download, the Win 7 laptop
 is offering an RWIN of 17520 bytes but it’s sending an ACK after every
 two TCP segments received from the server (far less than a full RWIN of
 data has been received) – why has that happened?
 
 My colo box runs Linux (3.2.46-1) and presumably so does the US server,
 as the HTTP header shows it’s also running nginx. Between those two
 boxes the 2nd packet shows the same behaviour: the client is ACKing
 long before the RWIN is full – why is this happening?
 
 So my main focus is the spreadsheet and trying to get the calculations
 correct, assuming the hosts were acting as expected. Are the calculations
 just plain wrong and I’m a fucking trumpet, or is this an unreliable
 calculation because hosts do all sorts of crazy jazz that is OS/TCP
 stack dependent? Ideally I want to get the calculations correct; host
 settings can be adjusted (hopefully!).
 
 Cheers,
 James.
 
 [1] https://www.cloudshark.org/captures/be31c1aa5ce1
 [2] https://www.cloudshark.org/captures/5c1d6eba6b7f
 [3] 
 https://docs.google.com/spreadsheets/d/1XcIEip53VwIZoXtq6Lwc95UcWoZkqepGackFtLMZj-4/edit?usp=sharing