Re: TCP Performance
I mean, programmatically speaking, your network equipment generally knows no difference between an HTTP packet and an iperf packet. (Layer 3 packet forwarding only breaks the first 84 bits off the header; Layer 2 gets 52 bits, with a VLAN tag.) So unless QoS of some kind gets brought into the picture, the protocol shouldn't make a difference. However, if something is examining the packets further than that, it could also be causing your throughput issues.

Also, keep in mind some switches ship with QoS enabled for VoIP etc. Might be something to check into further.

On Tue, Sep 3, 2013 at 3:18 AM, Bryan Tong wrote:

> Try your iperf over port 80 and see if you're hitting any website-related
> filters. At least rule it out.
>
> Or try HTTP on a different port.
>
> If your iperf test is getting link speed then you can rule out most things
> connection related. I really think some device is QoS'ing packets bound
> to or from port 80.

--
Bryan Tong
Nullivex LLC | eSited LLC
(507) 298-1624
Re: TCP Performance
Try your iperf over port 80 and see if you're hitting any website-related filters. At least rule it out.

Or try HTTP on a different port.

If your iperf test is getting link speed then you can rule out most things connection related. I really think some device is QoS'ing packets bound to or from port 80.

On Tue, Aug 27, 2013 at 12:22 PM, Nick Olsen wrote:

> I have indeed tried that, and it didn't make any difference. That
> functionally limits each router port to its connected microwave link's
> capacity and queues the overflow. However, the queue never really fills,
> as the traffic rate never goes higher than the allocated bandwidth.
>
> Nick Olsen
> Network Operations
> (855) FLSPEED x106

--
Bryan Tong
Nullivex LLC | eSited LLC
(507) 298-1624
Re: TCP Performance
I have indeed tried that, and it didn't make any difference. That functionally limits each router port to its connected microwave link's capacity and queues the overflow. However, the queue never really fills, as the traffic rate never goes higher than the allocated bandwidth.

Nick Olsen
Network Operations
(855) FLSPEED x106

From: "Blake Dunlap"
Sent: Tuesday, August 27, 2013 1:32 PM
To: n...@flhsi.com
Cc: "Tim Warnock", "nanog@nanog.org"
Subject: Re: TCP Performance

If you have a router, you can turn on shaping to the bandwidth the link will support.

-Blake
RE: TCP Performance
I do indeed have stats for "TX Pause Frames", and they do increment. However, our router is ignoring them since it doesn't support flow control.

I guess my next question would be: in the scenario where we insert a switch that does support flow control between the radio and the router, are we not only moving where the overflow is going to occur? Will we not see the router still burst traffic at line rate toward the switch, which then overflows its buffer sending to the radio on account of it receiving pause frames?

Nick Olsen
Network Operations
(855) FLSPEED x106

From: "Tim Warnock"
Sent: Tuesday, August 27, 2013 1:08 PM
To: "Blake Dunlap", "n...@flhsi.com"
Cc: "nanog@nanog.org"
Subject: RE: TCP Performance

In my experience - if you're traversing licenced microwave links as indicated, flow control will definitely need to be ON.

Check the radio modem stats to confirm but - if you're seeing lots of drops there, you're overflowing the buffers on the radio modem.
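For a sense of scale on the pause frames discussed above: an IEEE 802.3x pause frame carries a 16-bit pause time counted in quanta of 512 bit times. The Python sketch below turns that into seconds and into the amount of data a sender that ignores the pause would still push during that interval. The function names are hypothetical and the 1 Gbit/s figure is only the Ethernet handoff speed mentioned elsewhere in this thread; it is an illustration, not a model of any particular radio or switch.

```python
PAUSE_QUANTUM_BITS = 512  # one pause quantum is 512 bit times (IEEE 802.3x)


def pause_duration_s(quanta, link_bps):
    """Seconds a link partner is asked to stay silent for one pause frame."""
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("pause time is a 16-bit field")
    return quanta * PAUSE_QUANTUM_BITS / link_bps


def bytes_sent_if_pause_ignored(quanta, link_bps):
    """Bytes a line-rate sender emits during the pause if it ignores it."""
    return pause_duration_s(quanta, link_bps) * link_bps / 8


if __name__ == "__main__":
    link_bps = 1e9  # assumed gigabit Ethernet handoff to the radio
    for quanta in (1, 1024, 0xFFFF):
        ms = pause_duration_s(quanta, link_bps) * 1e3
        kb = bytes_sent_if_pause_ignored(quanta, link_bps) / 1e3
        print(f"{quanta:>5} quanta: {ms:.4f} ms pause, {kb:.1f} kB if ignored")
```

Whatever device honours the pause has to hold that traffic itself, which is the substance of the question about the overflow simply moving upstream.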
Re: TCP Performance
No QoS is in use anywhere. To the best of my ability I've eliminated packet loss. However, I've not found a way any better than ICMP/MTR/ping -f, etc.

The reason flow control has been mentioned is to correct buffer overflow at the microwave links, where they physically link at gigabit full duplex but the radio interface is only capable of ~360Mb/s. It's possible for the sending device to overflow the buffer between the fiber/Ethernet and the radio interface. I can say we've had an issue like this in the past, which forcing 100Mb/s full duplex on a licensed radio fixed, since the Ethernet was then slower than the radio interface. However, the downfall of this is that it limits the link to 100Mb/s, which isn't sufficient anymore.

In terms of congestion, there is none from my point of view. Every link in question runs at <=30% utilization.

Nick Olsen
Network Operations
(855) FLSPEED x106

From: "Blake Dunlap"
Sent: Tuesday, August 27, 2013 11:42 AM
To: n...@flhsi.com
Cc: na...@thedaileyplanet.com, "nanog@nanog.org"
Subject: Re: TCP Performance

This really sounds like you aren't testing the correct flow type in i/jperf, or you have some QoS queues for HTTP traffic (but not the perf traffic) that are filled.

Regardless, your problem looks like either tail drops or packet loss, which you showed originally. The task is to find out where this is occurring, and which of the two it is. If you want to confirm what is going on, there are some great bandwidth calculators on the internet which will show you what bandwidth you can get with a given ms delay and % packet loss.

As far as flow control, it's really outside the scope. If you ever need flow control, there is usually a specific reason like FCoE, and if not, it's generally better to just fix the backplane congestion issue if you can than ever worry about using FC. The problem with FC isn't node to node; it's when you have node to node to node with additional devices. It isn't smart enough to discriminate, and can crater your network 3 devices over when it would be much better to just lose a few packets.

-Blake
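The speed mismatch described above (a gigabit Ethernet handoff feeding a radio interface of roughly 360Mb/s) can be put into rough numbers. The Python sketch below assumes a simple FIFO buffer that fills at line rate and drains at the radio rate. The 1 MB buffer size is a made-up figure, since the thread does not give the radios' actual buffer sizes, and the function names are hypothetical.

```python
def seconds_until_overflow(buffer_bytes, ingress_bps, egress_bps):
    """Time for a sustained line-rate burst to fill an empty buffer."""
    if ingress_bps <= egress_bps:
        return float("inf")  # the buffer drains at least as fast as it fills
    return buffer_bytes * 8 / (ingress_bps - egress_bps)


def largest_absorbed_burst_bytes(buffer_bytes, ingress_bps, egress_bps):
    """Largest back-to-back burst that fits without a drop."""
    if ingress_bps <= egress_bps:
        return float("inf")
    # While the burst arrives, a fraction egress/ingress of it drains away.
    return buffer_bytes / (1 - egress_bps / ingress_bps)


if __name__ == "__main__":
    buffer_bytes = 1_000_000  # hypothetical radio buffer size
    radio_bps = 360e6         # approximate radio capacity from the thread
    for port_bps in (1e9, 100e6):
        t = seconds_until_overflow(buffer_bytes, port_bps, radio_bps)
        print(f"port at {port_bps / 1e6:.0f} Mbit/s: overflow after {t * 1e3:.1f} ms")
```

With the port forced to 100Mb/s the ingress can never outrun the radio, which matches the earlier fix described above, at the cost of capping the link at 100Mb/s.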
Re: TCP Performance
If you have a router, you can turn on shaping to the bandwidth the link will support.

-Blake

On Tue, Aug 27, 2013 at 12:11 PM, Nick Olsen wrote:

> I do indeed have stats for "TX Pause Frames", and they do increment.
> However, our router is ignoring them since it doesn't support flow control.
>
> I guess my next question would be: in the scenario where we insert a
> switch that does support flow control between the radio and the router,
> are we not only moving where the overflow is going to occur? Will we not
> see the router still burst traffic at line rate toward the switch, which
> then overflows its buffer sending to the radio on account of it receiving
> pause frames?
>
> Nick Olsen
> Network Operations
> (855) FLSPEED x106
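Shaping to the link bandwidth, as suggested above, is commonly built on a token bucket: tokens accrue at the shaped rate up to a burst allowance, and a packet may leave only when enough tokens are available. The Python sketch below is a generic illustration of that idea, not how any specific router implements it. The class name, the 15,000-byte burst allowance, and the simulated traffic are all assumptions.

```python
class TokenBucket:
    """Generic token bucket; a shaper queues packets that do not conform."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last_time = 0.0

    def allow(self, now, packet_bytes):
        """Return True if the packet may be sent at time `now` (seconds)."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill for the elapsed time, capped at the burst allowance.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # a real shaper would hold the packet in its queue


if __name__ == "__main__":
    shaper = TokenBucket(rate_bps=360e6, burst_bytes=15_000)
    packet_bytes = 1500
    gap_s = packet_bytes * 8 / 1e9  # packet spacing at 1 Gbit/s line rate
    sent = sum(shaper.allow(i * gap_s, packet_bytes) for i in range(1000))
    print(f"{sent} of 1000 line-rate packets conform to the shaped rate")
```

The burst allowance matters here: if it is larger than what the downstream radio buffer can absorb, the shaper can still hand the radio a burst at line rate.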
RE: TCP Performance
> Regardless, your problem looks like either tail drops or packet loss, which
> you showed originally. The task is to find out where this is occurring, and
> which of the two it is. If you want to confirm what is going on, there are
> some great bandwidth calculators on the internet which will show you what
> bandwidth you can get with a given ms delay and % packet loss.
>
> As far as flow control, it's really outside the scope. If you ever need flow
> control, there is usually a specific reason like FCoE, and if not, it's
> generally better to just fix the backplane congestion issue if you can
> than ever worry about using FC. The problem with FC isn't node to node; it's
> when you have node to node to node with additional devices. It isn't smart
> enough to discriminate, and can crater your network 3 devices over when it
> would be much better to just lose a few packets.
>
> -Blake

In my experience - if you're traversing licenced microwave links as indicated, flow control will definitely need to be ON.

Check the radio modem stats to confirm but - if you're seeing lots of drops there, you're overflowing the buffers on the radio modem.
Re: TCP Performance
This really sounds like you aren't testing the correct flow type in i/jperf, or you have some QoS queues for HTTP traffic (but not the perf traffic) that are filled.

Regardless, your problem looks like either tail drops or packet loss, which you showed originally. The task is to find out where this is occurring, and which of the two it is. If you want to confirm what is going on, there are some great bandwidth calculators on the internet which will show you what bandwidth you can get with a given ms delay and % packet loss.

As far as flow control, it's really outside the scope. If you ever need flow control, there is usually a specific reason like FCoE, and if not, it's generally better to just fix the backplane congestion issue if you can than ever worry about using FC. The problem with FC isn't node to node; it's when you have node to node to node with additional devices. It isn't smart enough to discriminate, and can crater your network 3 devices over when it would be much better to just lose a few packets.

-Blake

On Tue, Aug 27, 2013 at 9:49 AM, Nick Olsen wrote:

> Duplex mismatch has been checked across the board. On every device.
>
> Nick Olsen
> Network Operations
> (855) FLSPEED x106
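The bandwidth calculators mentioned above generally rest on an approximation such as the Mathis et al. model, which bounds the steady-state throughput of a single TCP stream by (MSS / RTT) * C / sqrt(p), where p is the packet loss probability and C is a constant commonly taken as sqrt(3/2). The Python sketch below applies that bound. The MSS, RTT, and loss figures are illustrative assumptions, not measurements from this network, and the function name is hypothetical.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3.0 / 2.0)):
    """Approximate ceiling on single-stream TCP throughput, in bits/s."""
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    mss_bytes = 1460  # assumed MSS for a 1500-byte MTU path
    rtt_s = 0.010     # assumed 10 ms round-trip time
    for loss in (0.0001, 0.001, 0.01):
        bps = mathis_throughput_bps(mss_bytes, rtt_s, loss)
        print(f"loss {loss:.2%}: ~{bps / 1e6:.1f} Mbit/s ceiling")
```

Because the bound scales with 1/sqrt(p), even a small loss rate on a path with non-trivial delay can hold one stream far below link capacity while several parallel streams still fill the link.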
Re: TCP Performance
I have done a decent amount of reading on both TCP windowing and flow control, but I've seen a lot of conflicting data. Some say flow control breaks more than it fixes, while others say it's completely required. Currently we do not have flow control enabled. Our routers do not currently support flow control (at least not at a configurable level; maybe at the NIC hardware level). The only way we could currently implement flow control would be installing a managed switch (with flow control) between the router(s) and the microwave links.

Regarding packet loss, we once again have conflicting data. If you take a look at the packet captures, the file download in Orlando (which rocks ~800Mb/s) shows ~5K retransmits/dup ACKs. However, the file download in Cocoa (crossing the wireless) shows about 3x that (~16K retransmits/dup ACKs). The same is shown on an intra-network test from server to server, but only over HTTP: iperf testing shows ~18 errors, vs ~13K errors when HTTP based.

Nick Olsen
Network Operations
(855) FLSPEED x106

From: "Blake Dunlap"
Sent: Tuesday, August 27, 2013 10:32 AM
To: n...@flhsi.com
Cc: "nanog@nanog.org"
Subject: Re: TCP Performance

You didn't indicate this, but do you understand how TCP windowing works? This conversation can go two very different ways depending on the answer.

To me, it looks like this is what you'd expect, and you need to fix your packet loss issues, which possibly might be QoS settings related (but it's hard to tell based on the information given).

-Blake
Re: TCP Performance
Duplex mismatch has been checked across the board. On every device.

Nick Olsen
Network Operations
(855) FLSPEED x106

From: "Chad Dailey"
Sent: Tuesday, August 27, 2013 10:48 AM
To: n...@flhsi.com
Subject: Re: TCP Performance

Check for duplex mismatch at the server.
Re: TCP Performance
You didn't indicate this, but do you understand how TCP windowing works? This conversation can go two very different ways depending on the answer.

To me, it looks like this is what you'd expect, and you need to fix your packet loss issues, which possibly might be QoS settings related (but it's hard to tell based on the information given).

-Blake

On Mon, Aug 26, 2013 at 2:07 PM, Nick Olsen wrote:
> Greetings all, I've got an issue I was hoping to put a few more eyes on.
> Here's the scenario. Downloading a file at our Border is multiple orders
> of magnitude faster then a few hops out.
> [...]
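For readers following the windowing and packet-loss point, here is a rough sketch of the two ceilings that usually bound a single TCP stream: one window per round trip, and the Mathis et al. loss approximation. The 16 KByte window is the default shown in the iperf output of the original message (Linux may autotune it upward, so treat it loosely); the 10 ms RTT, the 1460-byte MSS and the loss rates are assumptions for illustration only, since the thread does not state them.

```python
import math


def window_limited_bps(window_bytes, rtt_s):
    """Ceiling when at most one window can be in flight per round trip."""
    return window_bytes * 8 / rtt_s


def loss_limited_bps(mss_bytes, rtt_s, loss_rate):
    """Mathis et al. approximation: rate <= (MSS / RTT) * (C / sqrt(p)).

    C = sqrt(3/2) is the constant for periodic loss without delayed ACKs;
    real stacks differ, so this is an order-of-magnitude estimate only.
    """
    c = math.sqrt(3 / 2)
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    rtt = 0.010   # assumed round-trip time in seconds (not given in the thread)
    mss = 1460    # assumed MSS for a 1500-byte MTU path

    print("window-limited, 16 KByte window: %.1f Mbit/s"
          % (window_limited_bps(16 * 1024, rtt) / 1e6))
    for p in (0.0001, 0.001, 0.01):   # assumed loss rates
        print("loss-limited, p=%g: %.1f Mbit/s"
              % (p, loss_limited_bps(mss, rtt, p) / 1e6))
```

The loss bound falls with the square root of the loss rate, so a path with four times the loss gets roughly half the single-stream throughput at the same RTT.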
TCP Performance
Greetings all,

I've got an issue I was hoping to put a few more eyes on. Here's the scenario. Downloading a file at our border is multiple orders of magnitude faster than a few hops out. Using the same 128MB test file, I tested at two different locations, as well as between them. Using multiple connections improves throughput; however, it's the single-stream issue we're looking at right now. All testing servers in question are CentOS Linux.

Orlando
Datapath: Cogent > Orlando Border Router (Mikrotik) > HP Procurve Switch > Server
Results: 2013-08-29 05:04:09 (52.6 MB/s) - `128mbfile.tgz' saved [127926272/127926272]

Cocoa NOC
Datapath: Cogent > Orlando Border Router (Mikrotik) > Licensed Microwave Link (300+Mb/s Capacity) > East Orange Router (Mikrotik) > Licensed Microwave Link (300+Mb/s Capacity) > Cocoa Router (Mikrotik) > Licensed Microwave Link (300+Mb/s Capacity) > Colo Router (Mikrotik) > NOC Router (Mikrotik) > HP Procurve Switch > Server
Results: 2013-08-26 13:42:25 (398 KB/s) - `128mbfile.tgz' saved [127926272/127926272]

Orlando-Cocoa NOC
Datapath: Orlando Server > HP Procurve Switch > Orlando Border Router (Mikrotik) > Licensed Microwave Link (300+Mb/s Capacity) > East Orange Router (Mikrotik) > Licensed Microwave Link (300+Mb/s Capacity) > Cocoa Router (Mikrotik) > Licensed Microwave Link (300+Mb/s Capacity) > Colo Router (Mikrotik) > NOC Router (Mikrotik) > HP Procurve Switch > Server
Results: 2013-08-26 13:56:25 (3.31 MB/s) - `128mbfile.tgz' saved [134217728/134217728]

Now, for the fun of it, I ran iperf with a single TCP stream between our Cocoa and Orlando POPs, just like the HTTP test above. (The server has a 100Mb/s port.) It maxes out the port, unlike the HTTP test.

    [root@ded01 ~]# iperf -c 208.90.219.18
    Client connecting to 208.90.219.18, TCP port 5001
    TCP window size: 16.0 KByte (default)
    [  3] local 206.208.56.130 port 47281 connected with 208.90.219.18 port 5001
    [ ID] Interval       Transfer     Bandwidth
    [  3]  0.0-10.0 sec   114 MBytes  95.7 Mbits/sec

Here are the associated packet captures for each transfer, as well as full wget output and traceroutes for each test. As you can see, the tests crossing the wireless links show about 3x more TCP retransmits/dup ACKs. But I'm not sure I'm sold that this could cause such a huge drop in throughput. Other than that, nothing really stands out to me as to why these transfers are so much slower. Intra-network iperf testing shows full throughput the whole way with a single connection, as does UDP testing. One thing to note is that the iperf testing has far fewer TCP retransmits/dup ACKs than any of the HTTP testing, crossing the same microwave links and routers.

http://cdn.141networks.com/files/captures.zip

I appreciate any insight anyone might have. Thanks!

Nick Olsen
Network Operations
(855) FLSPEED x106
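To put the three wget results and the iperf result on one scale, the following sketch converts the reported rates to Mbit/s. It assumes wget's KB/s and MB/s are binary units (1024-based), which is an assumption about the tool rather than something stated in the message; the dictionary names are made up for illustration.

```python
# wget rates as reported in the message above.
WGET_RESULTS = {
    "Orlando (border)": (52.6, "MB/s"),
    "Cocoa NOC (via Cogent and microwave)": (398, "KB/s"),
    "Orlando-Cocoa NOC (internal)": (3.31, "MB/s"),
}

# Assumption: wget reports binary units (1 KB = 1024 bytes).
UNIT_BYTES = {"KB/s": 1024, "MB/s": 1024 ** 2}

IPERF_MBIT = 95.7  # single-stream iperf result from the message


def to_mbit_per_s(value, unit):
    """Convert a wget-style rate to megabits per second (10^6 bits)."""
    return value * UNIT_BYTES[unit] * 8 / 1e6


for name, (value, unit) in WGET_RESULTS.items():
    print("%-40s %8.2f Mbit/s" % (name, to_mbit_per_s(value, unit)))

border = to_mbit_per_s(*WGET_RESULTS["Orlando (border)"])
noc = to_mbit_per_s(*WGET_RESULTS["Cocoa NOC (via Cogent and microwave)"])
internal = to_mbit_per_s(*WGET_RESULTS["Orlando-Cocoa NOC (internal)"])
print("border vs. NOC download ratio: %.0fx" % (border / noc))
print("iperf vs. HTTP on the internal path: %.1fx" % (IPERF_MBIT / internal))
```

Under that unit assumption, the border-to-NOC gap works out to a little over two orders of magnitude, and the HTTP transfer on the internal path runs well below what iperf achieves over the same links.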
Re: Troubleshooting TCP performance tutorial
Abel Alejandro wrote:
> Greetings,
>
> This past week I have been trying to find the root cause of tcp
> performance problems of a few clients that are using a third party metro
> Ethernet for transport.
> [...]
> So the question basically if is there a good tutorial or white paper for
> troubleshooting tcp with emphasis of using tools like Wireshark to debug
> and track this kind of problems.

It might be worth your while to run the analysis found here:
http://netalyzr.icsi.berkeley.edu/index.html
TCP Performance (NANOG Digest, Vol 32, Issue 60)
Sat, 18 Sep 2010 09:34:55 + nanog-requ...@nanog.org fuream loqour :

>From: "Abel Alejandro"
>Subject: Troubleshooting TCP performance tutorial
>
>This past week I have been trying to find the root cause of tcp
>performance problems of a few clients that are using a third party metro
>Ethernet for transport. RFC2544 tests (Layer 2) and iperf using UDP give
>good symmetric performance almost 100% the speed of the circuit. However
>all kind of TCP tests result in some kind of asymmetrical deficiency,

Walking through the layers, if L2 checks out, & UDP checks out, but TCP does not, you might have a problem "up the stack", @ L4 with TCP. I don't know your test rig / setup, but a few ideas come to mind:

- Mismatch in the TCP personality between the hosts - mixing Windows & some UNIX OS flavors, like Sun Solaris 2.8 or others, can produce incredible degradation in streaming TCP performance, either due to buggy TCP behavior, or because one TCP stack supports options the other doesn't, like one side is doing classic Reno TCP (max 64K window) and doesn't respect or process RFC1323 window-sizing options, but the other side does. It happens UNIX-to-UNIX too, and embedded devices usually don't have robust TCP/IP stacks either (like hand-held barcode readers, etc.). Did any client patch an OS recently?

- Someone is doing QoS of some type, either expressly or indirectly by effect. Stream / test TCP on multiple ports / sockets to multiple hosts, different OS combinations, or a known good combination - Knoppix Linux boot CDs are great for this kind of thing - send them to a client, pop 'em in, do some testing - once built a test CD for this, fun, gets rid of that long "troubleshoot loop".

- MTU / other issue? Did your UDP stream pack full-size payloads? If not, walk up the size of the test packets to the maximum, see what happens. If it breaks somewhere, you have the beginning of an answer, etc.

- Not thinking it's the client app, since you mentioned you've run a whole set of TCP-based tests.
/dmfh

--
 __| |_ __  / _| |_    01100100 01101101
/ _` | '  \|  _| ' \   01100110 01101000
\__,_|_|_|_|_| |_||_|  dmfh(-2)dmfh.cx
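The "walk up the size of the test packets" suggestion can be sketched as a small generator of iperf UDP test commands. This is only an illustration: the host 192.0.2.10 is a documentation placeholder, the 90M rate and the 576 to 1500 byte range are assumptions, and the header arithmetic assumes plain IPv4 without options. The commands are printed, not executed.

```python
IPV4_HEADER_BYTES = 20  # assumes IPv4 with no IP options
UDP_HEADER_BYTES = 8


def udp_payload_for(ip_packet_bytes):
    """UDP payload size that yields an IP packet of the given total size."""
    return ip_packet_bytes - IPV4_HEADER_BYTES - UDP_HEADER_BYTES


def sweep_packet_sizes(start=576, stop=1500, step=100):
    """IP packet sizes from start upward, always ending exactly at stop."""
    sizes = list(range(start, stop, step))
    sizes.append(stop)
    return sizes


def iperf_udp_command(host, ip_packet_bytes, rate="90M"):
    """Build an iperf (version 2 style) UDP client command line.

    -u selects UDP, -l sets the datagram payload length, -b the target rate.
    """
    payload = udp_payload_for(ip_packet_bytes)
    return "iperf -u -c %s -l %d -b %s" % (host, payload, rate)


if __name__ == "__main__":
    for size in sweep_packet_sizes():
        print(iperf_udp_command("192.0.2.10", size))
```

If loss or a throughput collapse shows up only above some packet size, that size is a reasonable place to start looking for an MTU or fragmentation problem on the path.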
Re: Troubleshooting TCP performance tutorial
On Saturday, September 18, 2010, Kevin Oberman wrote:
>
> You might look at http://fasterdata.es.net. A lot of it is aimed at very
> large volume data transfers, but quite a bit is relevant to all TCP
> issues.

+1 fasterdata.es.net. Excellent resource.

-brandon

--
Brandon Galbraith
US Voice: 630.492.0464
Re: Troubleshooting TCP performance tutorial
> Date: Fri, 17 Sep 2010 20:06:09 -0400
> From: "Abel Alejandro"
>
> Greetings,
>
> This past week I have been trying to find the root cause of tcp
> performance problems of a few clients that are using a third party metro
> Ethernet for transport.
> [...]
> So the question basically if is there a good tutorial or white paper for
> troubleshooting tcp with emphasis of using tools like Wireshark to debug
> and track this kind of problems.

You might look at http://fasterdata.es.net. A lot of it is aimed at very large volume data transfers, but quite a bit is relevant to all TCP issues.

--
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: ober...@es.net Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751
Re: Troubleshooting TCP performance tutorial
http://www.amazon.com/Wireshark-Network-Analysis-Official-Certified/dp/1893939995

Spendy but looks good. I'll have to pick it up when the next consulting check comes in. Thanks!

I was sad to see that Eric Hall's book was out of print. At least cheap used copies are available. I forgot my copy a few jobs ago... I'm sure someone is getting help from it.

--
Joe Hamelin, W7COM, Tulalip, WA, 360-474-7474

On Fri, Sep 17, 2010 at 6:00 PM, Tim Eberhard wrote:
> To add on to that. Recently Wireshark Network Analysis was released. It's an
> excellent book covering wireshark and reading packet captures in general by
> Laura Chappell. I just finished reading it and I have to say it's an
> excellent book. Highly recommended.
>
> Between those two books I think you'll be very close to being a
> wireshark/packet capture guru.
> [...]
Re: Troubleshooting TCP performance tutorial
To add on to that. Recently Wireshark Network Analysis was released. It's an excellent book covering wireshark and reading packet captures in general by Laura Chappell. I just finished reading it and I have to say it's an excellent book. Highly recommended.

Between those two books I think you'll be very close to being a wireshark/packet capture guru.

I hope this helps,
-Tim Eberhard

On Fri, Sep 17, 2010 at 7:33 PM, Joe Hamelin wrote:
> In a situation like yours I found Internet Core Protocols: The
> Definitive Guide by Eric Hall an easy to read guide to insuring that
> what you are seeing via wireshark.
> [...]
> http://oreilly.com/catalog/9781565925724/
Re: Troubleshooting TCP performance tutorial
In a situation like yours I found Internet Core Protocols: The Definitive Guide by Eric Hall an easy-to-read guide to ensuring that what you are seeing via Wireshark is correct. I was able to find an issue with the DF bit in a load balancer that was causing confounding headaches in a network using Wireshark and this book.

Walk it through the syn-ack dance and don't trust that the devices are handling it correctly. Start at one end and work your way through and ensure to YOUR satisfaction that every device conforms to the protocol. Don't rush, don't jump to conclusions. Just follow the packet. That's the best advice I can give you.

http://oreilly.com/catalog/9781565925724/

--
Joe Hamelin, W7COM, Tulalip, WA, 360-474-7474

On Fri, Sep 17, 2010 at 5:06 PM, Abel Alejandro wrote:
> Greetings,
>
> This past week I have been trying to find the root cause of tcp
> performance problems of a few clients that are using a third party metro
> Ethernet for transport.
> [...]
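One concrete way to walk the syn-ack dance is to compare the TCP options offered in the SYN with those returned in the SYN-ACK, for example to spot a device that drops window scaling. The sketch below parses a raw TCP options field as copied from a capture; the function names and the sample bytes are hypothetical, and only the common option kinds (MSS, window scale, SACK permitted, timestamps) are decoded.

```python
import struct


def parse_tcp_options(raw):
    """Decode the common TCP options from a raw options field (bytes)."""
    opts = {}
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:              # End of Option List
            break
        if kind == 1:              # No-Operation, used as padding
            i += 1
            continue
        if i + 1 >= len(raw):      # truncated: no length byte
            break
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            break                  # malformed length, stop parsing
        body = raw[i + 2:i + length]
        if kind == 2 and length == 4:
            opts["mss"] = struct.unpack("!H", body)[0]
        elif kind == 3 and length == 3:
            opts["window_scale"] = body[0]
        elif kind == 4 and length == 2:
            opts["sack_permitted"] = True
        elif kind == 8 and length == 10:
            opts["timestamps"] = struct.unpack("!II", body)
        i += length
    return opts


def one_sided_options(syn_opts, synack_opts):
    """Names of options that appear in only one half of the handshake."""
    return set(syn_opts) ^ set(synack_opts)


if __name__ == "__main__":
    # Hypothetical sample bytes, not taken from the captures in this thread.
    syn = bytes.fromhex("020405b40402080a000000010000000001030307")
    synack = bytes.fromhex("020405b4")
    print(parse_tcp_options(syn))
    print(parse_tcp_options(synack))
    print(sorted(one_sided_options(parse_tcp_options(syn),
                                   parse_tcp_options(synack))))
```

Running the same comparison on captures taken at each hop shows where an option (or the MSS value) changes in flight, which is the kind of device misbehavior the advice above is meant to catch.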
Troubleshooting TCP performance tutorial
Greetings,

This past week I have been trying to find the root cause of TCP performance problems for a few clients that are using a third-party metro Ethernet for transport. RFC2544 tests (Layer 2) and iperf using UDP give good symmetric performance, almost 100% of the speed of the circuit. However, all kinds of TCP tests result in some kind of asymmetrical deficiency: either the upstream or the downstream of the client is hugely different. Latency is not a huge factor, since all the metro Ethernet connections have less than 2 ms.

So the question is basically whether there is a good tutorial or white paper for troubleshooting TCP, with emphasis on using tools like Wireshark to debug and track these kinds of problems.

Regards,
Abel.