DSL and/or Routing Problems
Greetings NANOGers,

Yesterday we started noticing long delays on an ADSL connection. I spent most of the day trying to track down the problem and getting nowhere. Telco says they do not detect any problem on the line... so I am kind of lost. Anyone here have any ideas?

Here are the specifics: This connection uses a Cisco 827 ADSL router and has several static IPs. All IPs show identical delays. Using other circuits between the same two locations, we do not see any delays. Normally on this DSL connection, local can ping remote with packet transit times around 60-70ms. Here is what we are seeing now:

# ping -s SOMEHOST 68 25; sleep 1; ping -s SOMEHOST 68 25
PING SOMEHOST: 68 data bytes
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=0. time=105. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=1. time=9132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=2. time=8132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=3. time=7132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=4. time=6132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=5. time=5133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=6. time=4133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=7. time=3133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=8. time=2133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=9. time=1133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=10. time=133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=11. time=104. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=12. time=110. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=13. time=109. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=14. time=112. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=15. time=106. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=16. time=114. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=17. time=107. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=18. time=109. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=19. time=106. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=20. time=112. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=21. time=106. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=22. time=108. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=23. time=106. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=24. time=110. ms
SOMEHOST PING Statistics
25 packets transmitted, 25 packets received, 0% packet loss
round-trip (ms) min/avg/max = 104/1918/9132
PING SOMEHOST: 68 data bytes
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=0. time=112. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=1. time=9131. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=2. time=8132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=3. time=7132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=4. time=6132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=5. time=5132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=6. time=4133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=7. time=3132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=8. time=2133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=9. time=1133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=10. time=133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=11. time=111. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=12. time=106. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=13. time=109. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=14. time=116. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=15. time=108. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=16. time=107. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=17. time=113. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=18. time=106. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=19. time=107. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=20. time=108. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=21. time=108. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=22. time=105. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=23. time=109. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=24. time=106. ms
SOMEHOST PING Statistics
25 packets transmitted, 25 packets received, 0% packet loss
round-trip (ms) min/avg/max = 105/1918/9131

What really has me bugged is the pattern shown by the first dozen packets... why the relatively quick first time, followed by a long but decreasing delay which repeats every time you restart the ping (that's why I provided 2 samples)?

Despite the fact that Telco says there are not any line problems, we are seeing a change in DSL performance compared to our benchmark. When we first started noticing the problem yesterday, both in and out connections were using the Fast path, but compared to the benchmark, the inbound speed had dropped to 576 and the Capacity had jumped to 99%, plus we had some RS and CRC errors on both in and out connections. Later in the day, the connection switched from using the Fast path to the Interleave path (we did nothing on our end to cause this to change) and the performance settled down to what is shown below under DSL NOW.

DSL BENCHMARK:
ATU-R (DS) ATU-C (US)
Capacity Used: 72%
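
One way to look at the decreasing-delay pattern in the ping output is to convert each round-trip time into an approximate reply arrival time. The sketch below is only an illustration: it assumes ping sends one packet per second (so packet N leaves at about N seconds), and the function name `arrival_times` and the trimmed sample are made up for this example.

```python
import re

# Matches the "icmp_seq=N. time=T. ms" part of each reply line shown above.
PING_LINE = re.compile(r"icmp_seq=(\d+)\.\s+time=(\d+)\.\s+ms")


def arrival_times(ping_output, interval_s=1.0):
    """Return (seq, send_time_s, arrival_time_s) for each reply line.

    Assumption: packet N is sent N * interval_s seconds after the start.
    """
    rows = []
    for match in PING_LINE.finditer(ping_output):
        seq = int(match.group(1))
        rtt_s = int(match.group(2)) / 1000.0
        sent = seq * interval_s
        rows.append((seq, sent, sent + rtt_s))
    return rows


# A few lines taken from the first run in the post.
sample = """
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=0. time=105. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=1. time=9132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=2. time=8132. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=5. time=5133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=9. time=1133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=10. time=133. ms
76 bytes from SOMEHOST (w.x.y.z): icmp_seq=11. time=104. ms
"""

for seq, sent, arrived in arrival_times(sample):
    print(f"seq={seq:2d} sent at {sent:5.1f}s, reply seen at about {arrived:6.3f}s")
```

Under that one-per-second assumption, the replies to packets 1 through 10 all come out at roughly the same arrival time (a little past 10 seconds). That would be consistent with something on the path holding traffic for about nine seconds and then releasing it in one burst, rather than each packet being delayed independently, though it does not say where the stall happens.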
Re: DSL and/or Routing Problems
[EMAIL PROTECTED] wrote:
This connection uses a Cisco 827 ADSL router and has several static IPs. All IPs show identical delays. Using other circuits between the same two locations, we do not see any delays.

What's the weather like? ;-)

See if you can get the ADSL router to give you upstream/downstream noise margins and any other useful reporting ...

AR Driver Counters Display :
TX :|packets: 8597915 = direct: 2923483 + qued: 5674434
    | = oamF4: 0 + oamF5: 0 + others
    |fail count = chNoEr: 0 + dropped: 0
    |txMissIsr= 0, queCnt= 0, txOnGoing= 0
RX :|packets: 8924470 = toATM: 8919249 + loopback: 0 + errors
    | , where oamF4: 0, oamF5: 0
    |errors = crc: 5069 + mbuf: 0 + len: 0 + pad: 0 + strayed: 151
    |rxMissIsr= 0, queCnt= 0, nonAA= 0, sramErr= 0, reqSramMax= 6
    |dummyIsr = 256833, fpgaIsr = 14826785
VC( 0 to 3 ) : 08924319
VC( 4 to 7 ) :
VC( 8 to 11 ) :
VC( 12 to 15 ) : 0151

Upstream Noise Margin
relative capacity occupation: 78%
noise margin upstream: 11.0 db
output power downstream: 16.0 dbm
attenuation upstream: 31.5 db
carrier load: number of bits per symbol(tone)
tone   0- 31: 00 00 00 04 67 77 66 65 66 66 66 66 55 54 43 00
tone  32- 63: 00 00 00 44 55 66 66 66 66 66 66 66 66 66 26 66
tone  64- 95: 66 65 55 54 45 55 55 44 44 44 44 44 44 43 33 22
tone  96-127: 22 22 02 22 22 20 00 00 00 00 00 00 00 00 00 00
tone 128-159: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
tone 160-191: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
tone 192-223: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
tone 224-255: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Downstream Noise Margin
relative capacity occupation: 95%
noise margin downstream: 6.5 db
output power upstream: 12.0 dbm
attenuation downstream: 66.5 db
carrier load: number of bits per symbol(tone)
tone   0- 31: 00 00 00 04 67 77 66 65 66 66 66 66 55 54 43 00
tone  32- 63: 00 00 00 44 55 66 66 66 66 66 66 66 66 66 26 66
tone  64- 95: 66 65 55 54 45 55 55 44 44 44 44 44 44 43 33 22
tone  96-127: 22 22 02 22 22 20 00 00 00 00 00 00 00 00 00 00
tone 128-159: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
tone 160-191: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
tone 192-223: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
tone 224-255: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Re: DSL and/or Routing Problems
[EMAIL PROTECTED] wrote:
Greetings NANOGers, Yesterday we started noticing long delays on an ADSL connection.
snip

Assuming it is not your ISP or that the telco is the ISP: don't believe them. Tell them to reset the port. Tell them to change the pairs. Tell them to switch your line to a different port on the DSLAM. Tell them to put you into a different CO. Tell them to dispatch a technician to test your line at the NID. Get an FTP server with good connectivity on the internet and upload/download to it, measuring your speed. Show the telco low bandwidth and packet loss. Do some flood pinging (carefully).

Test the line with a cheap Linksys or Netgear or SMC or D-Link or similar broadband residential router with ADSL modem (or even software [google for raspppoe for Windows; Linux has PPPoE software available as well, if that's what your setup uses]).

Spend a few dollars and get ADSL on another phone line if all of that does not work.

For the money they make off an ADSL line, a telco is unlikely to do more than run the standard automated web testing thingy, say "Everything fine here!" and hope you don't call back and cost them more. That makes sense. The more support time and expertise expended on you, the less profit generated for them by your business.

I can't count the number of "Tests perfectly!" problems that get resolved mysteriously inside the telco after some more harassment. Furthermore, our experience on average is that the more the line costs per month, the better service you get on it. Typically, with any large number of circuits, you will find the right people in the telco who actually give a damn about you and can get things done.

Joe
Re: UPS and generator interaction?
Oh, another detail. Some 98% of the UPSi around are standby units. They sit and trickle their batteries until the line fails, then quickly kick in. They take 'n' hours to recharge when the line returns.

But there exists another genus. These 'full time' units ALWAYS run the load from the UPS inverter, and have a big AC line-DC battery charger -- big enough so as to keep up. The advantage is a very high degree of line isolation. Any surge, sag, glitch, spike may affect the AC-DC side of the equation, but will have to get past the battery plant and inverter for the load to see it.

Note it does not even care what the input frequency is. I know of one large unit that was sent to Mexico City. At the time, it was 50Hz, but there was some announced plan for the city to go to 60, Real Soon Now. The UPS battery charger ran on anything between say 40-70 Hz, but the inverter made 60.0 Hz, period.

Such units are not common or cheap. In the low end, Sola Corp used to make some in the low (1-3?) KVA range. Top end, how many KVA do you need? I think the one going to Mexico City was 500KVA. If your power is really rotten [Here, Guyana comes to mind...] you may want to spend more up front.

Side thought, but not a NANOG topic. What in your data center really cares if your generator puts out 57 or 63 Hz, not 60.0? Why?

-- A host is a host from coast to [EMAIL PROTECTED] no one will talk to a host that's close[v].(301) 56-LINUX Unless the host (that isn't close).pob 1433 is busy, hung or dead20915-1433
Re: UPS and generator interaction?
David Lesher wrote:
Side thought, but not a NANOG topic. What in your data center really cares if your generator puts out 57 or 63 Hz, not 60.0? Why?

Some clocks get a little nutso, because they are powered by AC synchronous motors with gearing that assumes 60 Hz (or 50 Hz, as the case might be). Some fans and other devices also use synchronous or induction motors with similar engineering assumptions.

-- Requiescas in pace o email
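
To put a rough number on the clock effect: a clock driven by a synchronous motor counts line cycles, so its error scales with the ratio of actual to nominal frequency. The sketch below is just that arithmetic; the function name is made up for illustration, and it assumes the frequency stays at the off-nominal value for a full day.

```python
SECONDS_PER_DAY = 86400


def clock_drift_seconds_per_day(actual_hz, nominal_hz=60.0):
    """Seconds gained (+) or lost (-) per day by a clock geared for nominal_hz."""
    return SECONDS_PER_DAY * (actual_hz / nominal_hz - 1.0)


for hz in (57.0, 60.0, 63.0):
    drift = clock_drift_seconds_per_day(hz)
    print(f"{hz:4.1f} Hz: {drift:+.0f} s/day ({drift / 60:+.0f} min/day)")
```

At 57 or 63 Hz that works out to about 72 minutes slow or fast per day, which is annoying for a wall clock but says nothing about whether other motor loads tolerate the frequency.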
Re: UPS and generator interaction?
Line Interactive APC UPS's don't do well with about 160 volts and 83 Hz input. Apparently, you can only interact so much

Interacting with the line requires battery power. They can't charge and supply power at the same time, unlike traditional online UPSes. Once their batteries (LI type) go, they can't do anything.

Honest to God, APC replaced it because they said it should have tripped off line with the line being that out of whack ...

APC is very good about replacing equipment that might be their fault, even after warranty. I've never seen another company with staff as smart and technical available as quickly. Too bad no one has any real experience with their big gear to see if it's supported as well.

Deepak Jain
AiNET
Re: UPS and generator interaction?
Once upon a time, Deepak Jain [EMAIL PROTECTED] said:
APC is very good about replacing equipment that might be their fault. Even after warranty. I've never seen another company with as smart, technical staff available quickly. Too bad no one has any real experience with their big gear to see if it's supported as well.

We've got an APC power strip (20A vertical zero U mount with 14 outlets) that failed (possibly from heat). The circuit breaker tripped and would not reset. Their response was that we had too much plugged in (never mind that we switched to a different 20A strip and it is working fine on a 20A breaker) and that while they would replace it, if the replacement failed they would NOT replace it.

I prefer APC for small UPSes, but I'm not impressed by support on a simple power strip.

--
Chris Adams [EMAIL PROTECTED]
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.
Re: DSL and/or Routing Problems
DSL BENCHMARK:
                      ATU-R (DS)          ATU-C (US)
Capacity Used:           72%                 21%
                    Interleave  Fast    Interleave  Fast
Speed (kbps):            0       960         0       256
Reed-Solomon EC:         0         0         0         0
CRC Errors:              0         0         0         0
Header Errors:           0         0         0         0
Bit Errors:              0                   0
BER Valid sec:           0                   0
BER Invalid sec:         0                   0

DSL NOW:
                      ATU-R (DS)          ATU-C (US)
Capacity Used:           94%                 63%
                    Interleave  Fast    Interleave  Fast
Speed (kbps):          736         0       256         0
Reed-Solomon EC:        99         0         4         0
CRC Errors:              4         0         1         0
Header Errors:           3         0         0         0
Bit Errors:              0                   0
BER Valid sec:           0                   0
BER Invalid sec:         0                   0

You've gone from fast path to interleaved. Interleaved can inject up to 64ms of latency, in each direction, on top of the normal line latency. (E.g., with a 12ms loop time, interleaved can bump that up to 140ms latency.) Interleaved is used to trade latency for line stability. I'm not sure of the specifics on that, however.

Basically, you set your latency tolerance on the DSLAM, up to 64ms for up and downstream, and depending on line conditions, your latency will vary between base loop latency and the max allowed by your tolerance. On a good line, you won't see any latency injected; a poor line will run right up to the tolerance and still retrain due to errors.

You need to ask the telco why they've changed you from fast path, and request that you get put back to a fast path config. You MAY be able to restrict your DSL modem to training fast path only if they have your line set to auto for signaling.

Joshua Coombs
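
The 12ms-to-140ms figure above is simply the base round trip plus the maximum interleave delay applied once in each direction. A minimal sketch of that arithmetic follows; the function name is hypothetical and the 64ms ceiling is the figure quoted in this message, not something read from a DSLAM.

```python
def worst_case_rtt_ms(base_rtt_ms, interleave_delay_ms=64, directions=2):
    """Upper bound on round-trip time when interleaving adds its maximum
    delay in each direction (assumed figures, taken from the message above)."""
    return base_rtt_ms + interleave_delay_ms * directions


print(worst_case_rtt_ms(12))   # the 12ms loop example from the message
print(worst_case_rtt_ms(70))   # the 60-70ms baseline from the original post
```

By the same arithmetic, interleaving at its maximum would add at most 128ms to the original poster's 60-70ms baseline, so it could plausibly account for steady round trips a little over 100ms, but not for the multi-second delays at the start of each ping run.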
MLPPP Follow Up - How we fixed the problem
I asked the group some time ago about some problems we were seeing with MLPPP on our Cisco 7513s. I have had 5 or 6 people contact me off list to ask how we solved the problem, so I figured I would post our solution to the group. I am sure there may be other fixes, however this works great for us and we have not had a problem in months since converting all MLPPP customers over.

Basically we shut down MLPPP and went with (ip load-sharing per-packet)

Here is what our config looks like:

interface Serial1/0/0/13:0
 description Customer #4144 (San Diego) #1 UPDATE [4144]
 ip address X.X.X.X 255.255.255.252
 no ip directed-broadcast
 ip load-sharing per-packet
 ip route-cache distributed
 no cdp enable

interface Serial2/1/0/14:0
 description Customer #4144 (San Diego) #2 UPDATE [4144]
 ip address X.X.X.X 255.255.255.252
 no ip directed-broadcast
 ip load-sharing per-packet
 ip route-cache distributed
 no cdp enable

ip route X.X.X.X 255.255.255.252 Serial1/0/0/13:0
ip route X.X.X.X 255.255.255.252 Serial2/1/0/14:0

The only problem that we ran into was that we had to use the Serial designator of the interface in our route statement, otherwise it will not work (or at least it did not for us).

Since converting our customers (all MLPPP customers) to ip load-sharing per-packet, we have had no further problems.

Hope this helps someone

**
Richard J. Sears
Vice President
American Digital Network
[EMAIL PROTECTED]
http://www.adnc.com
858.576.4272 - Phone
858.427.2401 - Fax

I fly because it releases my mind from the tyranny of petty things . .

Work like you don't need the money, love like you've never been hurt and dance like you do when nobody's watching.
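
For readers unfamiliar with the difference, here is a toy model of per-packet versus per-destination load sharing across the two serial links named in the config above. It is only a conceptual sketch and not how IOS implements it: the round-robin and hash choices, the function names, and the 192.0.2.10 destination address are all assumptions made for illustration.

```python
from itertools import cycle
import zlib

# Interface names from the config above; everything else here is illustrative.
LINKS = ["Serial1/0/0/13:0", "Serial2/1/0/14:0"]


def per_packet(packets, links=LINKS):
    """Alternate links packet by packet (simple round robin)."""
    next_link = cycle(links)
    return [(pkt, next(next_link)) for pkt in packets]


def per_destination(packets, links=LINKS):
    """Pin every packet for a given destination to one link via a hash."""
    def pick(dst):
        return links[zlib.crc32(dst.encode("ascii")) % len(links)]
    return [(pkt, pick(pkt["dst"])) for pkt in packets]


# Four packets of a single flow to one (made-up) destination.
flow = [{"dst": "192.0.2.10", "seq": n} for n in range(4)]

for pkt, link in per_packet(flow):
    print("per-packet     ", pkt["seq"], "->", link)
for pkt, link in per_destination(flow):
    print("per-destination", pkt["seq"], "->", link)
```

In this model a single flow uses both links only in the per-packet case, which is why it can fill two T1s the way a bundle would. The usual trade-off is that packets of one flow may arrive out of order if the two links differ in delay.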
Re: DSL and/or Routing Problems
On Tue, 30 Mar 2004, Joe Maimon wrote:

: [EMAIL PROTECTED] wrote:
: Greetings NANOGers,
:
: Yesterday we started noticing long delays on an ADSL connection.
: snip
: Assuming it is not your ISP or that the telco is the ISP.
: Don't believe them. Tell them to reset the port. Tell them to change the

NETAT! Never Ever Trust A Telco! Test, test and test some more on your side and then demand they do the same. I have even had to troubleshoot their network. I did the above and then when it still didn't work everyone (my boss, my boss' boss, data center techs and the same level of telco folks) all got on a conference call for The Big Blame Party. It was, once again, their fault.

scott

: pairs. Tell them to switch your line to a different port on the dslam.
: Tell them to put you into a different CO. Tell them to dispatch a
: technician to test your line at the nid. Get an FTP server with good
: connectivity on the internet and upload/download to it, measuring your
: speed. Show the telco low bandwidth and packet loss. Do some flood
: pinging (carefully).
:
: Test the line with a cheap linksys or netgear or smc or dlink or similar
: broadband residential router with ADSL modem (or even software [google
: for raspppoe for windows, linux has pppoe software available as well -
: if that's what your setup uses]).
:
: Spend a few dollars and get ADSL on another phone line if that all does
: not work.
:
: For the money they make off an ADSL line, a Telco is unlikely to do more
: than run the standard automated web testing thingy and say Everything
: fine here! and hope you don't call back and cost them more. That makes
: sense. The more support time and expertise expended on you, the less
: profit generated for them by your business.
:
: I can't count the number of Tests perfectly! that get resolved
: mysteriously inside the telco after some more harassment. Furthermore,
: our experience on average is that the more the line costs per month, the
: better service you get on it. Typically with any large amount of
: circuits, you will find the right people in the telco who actually give
: a damn about you and can get things done
:
: Joe
Re: UPS and generator interaction?
On Tue, 30 Mar 2004, David Lesher wrote:
Side thought, but not a NANOG topic. What in your data center really cares if your generator puts out 57 or 63 Hz, not 60.0? Why?

Some UPSes, such as the Best FerrUPS series, and other voltage regulators and line conditioners that use a ferro-resonant transformer, where there's an L-C tuned circuit as part of the power transformer.

Other motor loads may care to some extent. Analog electric clocks will run slow or fast, no big deal. Lower frequencies are harder on marginally designed transformers which may not have enough core material.

--
Jay Hennigan - CCIE #7880 - Network Administration - [EMAIL PROTECTED]
WestNet: Connecting you to the planet. 805 884-6323 WB6RDV
NetLojix Communications, Inc. - http://www.netlojix.com/
Re: UPS and generator interaction?
We've got an APC power strip (20A vertical zero U mount with 14 outlets) that failed (possibly from heat). The circuit breaker tripped and would not reset. Their response was that we had too much plugged in (never mind that we switched to a different 20A strip and it is working fine on a 20A breaker) and that while they would replace it, if the replacement failed they would NOT replace it. I prefer APC for small UPSes, but I'm not impressed by support on a simple power strip.

That sounds like fairly decent support for a power strip. I would have let them replace it and seen if the same thing happened. If you were overloading it, the breaker should be resettable. However, a cheaper power strip may not have as touchy a breaker, or may allow more power than it should.

They have replaced full MATRIX 5000 units, batteries and all, for me a few times without quibbling and without any kind of extended service contract on them.

I'm sure someone will have already suggested that you put an ammeter on the load to see what is actually happening.

While I have never used APC's Zero U strip, I have heard good things about Baytech's version (which I've also not used). YMMV.

Deepak
publishing venue
I've been talking to the folks at Usenix about a venue for papers of interest to this group. They're very eager to have such papers at the LISA (Large Installation System Administration) conference. Timing is tight for this year -- the deadline is in three weeks ( http://www.usenix.org/events/lisa04/ ).

For those who aren't familiar with LISA, this is a conference that publishes proceedings of refereed papers. The success, of course, depends on members of this community submitting papers, this year and next -- that's what will make the conference interesting to this community.

I'm working on a journal, too.

--Steve Bellovin, http://www.research.att.com/~smb
Re: publishing venue
it is interesting that the theme for this year's lisa conference is System Administration Reality - Automation, Configuration, and Users some concepts near and dear to many of our hearts in the realm of network, as opposed to system, administration. randy
Re: DSL and/or Routing Problems
ping did _this_ Ping is not very informative or accurate. If you run a traceroute, which is also not very accurate, you can get some idea about where the delay appears to be. Is it the DSL segment? Is it somewhere else that traceroute can show you? The nice thing about delays that are this long is that 9000 ms is long enough it won't just be lost in the noise... It wouldn't be surprising if it's in your DSL, and if your DSL has changed to a lower speed (which looks like it might have happened), then maybe something _is_ wrong with your DSL, or maybe the slower speed is causing traffic backups that weren't a problem when you were getting 512 kbps, or you're getting TCP retransmissions, but maybe the problem is somewhere else in the network.
Re: UPS and generator interaction?
Yo Brian!

On Mon, 29 Mar 2004, Laurence F. Sheldon, Jr. wrote:
Brian (nanog-list) wrote:
Does anyone know of a way to get a UPS to trigger a generator to start, and to switch over to the generator power automatically or does this type of thing just not exist?

Find somebody with Internet Access and a browser--go to Google.com, enter generator backup ups in the box.

Otherwise stroll down to Home Depot. My HD sells a full kit, including generator. Then hire an electrician to install it since the code requirements are not obvious.

RGDS
GARY
---
Gary E. Miller Rellim 20340 Empire Blvd, Suite E-3, Bend, OR 97701
[EMAIL PROTECTED] Tel:+1(541)382-8588 Fax: +1(541)382-8676
Re: DSL and/or Routing Problems
On Tue, 30 Mar 2004, Stewart, William C (Bill), RTSLS wrote:
ping did _this_ Ping is not very informative or accurate. If you run a traceroute, which is also not very accurate,

Get the best of both tools and use mtr (assuming unix-like platform). There are similar tools for windows (pingplotter?).

This thread reminds me of my own DSL, which rides the ILEC's network and is handed off to $work at the CO as an ATM PVC. For years, my DSL service has oscillated from fine (20-30ms ping times) to not good (200-300ms) to unusable (>=1000ms ping times). It seems to work fine for months, then gets bad to really bad for days or weeks at a time. I've replaced CPE several times, and even keep 2 totally different brand/model routers at the house, just in case (so when I call the DSG, I can say yes, not only have I power cycled it, I've replaced the router).

I've spent considerable time on the phone with the ILEC. Most calls, they claim there's nothing wrong. A few times, they've admitted it's a known problem with the lt card, not that that means much to me, and resetting it often makes things better.

--
Jon Lewis [EMAIL PROTECTED] | I route
Senior Network Engineer | therefore you are
Atlantic Net |
http://www.lewis.org/~jlewis/pgp for PGP public key
Re: publishing venue
I've been talking to the folks at Usenix about a venue for papers of interest to this group. They're very eager to have such papers at the LISA (Large Installation System Administration) conference. Timing is tight for this year -- the deadline is in three weeks ( http://www.usenix.org/events/lisa04/ ). For those who aren't familiar with LISA, this is a conference that publishes proceedings of refereed papers. The success, of course, depends on members of this community submitting papers, this year and next -- that's what will make the conference interesting to this community.

There is a reasonable level of overlap into the network operator field. LISA and USENIX are both great conferences to attend if you want to know what to be planning your network for. These are the guys running the big server platforms doing interesting things with applications, and I've found it very useful to review the papers out of both of these meetings [and also Apache-Con].

Regards,
Neil.