Re: US .mil blocking in Japan
On Tue, Mar 15, 2011 at 09:49:56PM -0600, ryanL wrote: should i be surprised that this hasn't been discussed much? anyone care to elaborate and/or expand on the real telecom damage done in japan? What's to be surprised about? The US military is temporarily blocking access to certain high-traffic web sites on its networks. This obviously affects only those users on DoD networks. What damage are you referring to? --Jeff
Re: US .mil blocking in Japan
On Wed, Mar 16, 2011 at 09:14:13AM -0700, andrew.wallace wrote: This isn't the rhetoric of a super power, more like one of a university campus. [...] It strikes me straight away as amateurish to be blocking web sites in able to have enough bandwidth for operational purposes. On the contrary, it's entirely plausible that US forces assisting with the recovery are (1) using more communications resources than normal, and (2) relying on infrastructure that's operating in a degraded state due to fiber or power issues. If so, it's entirely reasonable to put limits on bandwidth-hungry but non-essential applications as a precautionary measure. Here's an excerpt from http://www.nextgov.com/nextgov/ng_20110314_9111.php?oref=topnews: Military units operating in Japan face bandwidth shortages and network limitations that inhibit communications and command and control, Defense sources told Nextgov. Misawa Air Base, located on the northeast tip of Honshu, warned its personnel on a blog post Friday that the Defense Switched Network, which handles voice calls, was in backup mode and had only limited capacity, a fact confirmed by a Pentagon source Monday. The blog post added, We have a number of connectivity issues. Internet has been up and down due to our connections through other places in Japan. For example, Yokota [Air Base] and several other locations are having issues because we all have power and connectivity issues right now. The Pentagon also took the extraordinary step of blocking access to a range of commercial websites to ensure that its networks have enough bandwidth to support mission-essential communications, Nextgov learned. This move, a military source told Nextgov, possibly indicates one or more undersea cables used by military networks were damaged by the earthquake. --Jeff
Re: Router only speaks IGP in BGP network
On Sat, Dec 25, 2010 at 08:52:42AM -0500, ML wrote: If you're only redistributing 10 prefixes into OSPF? Problem?

I know I'm a little late to this thread, but figured I'd point out one reason why this can be very dangerous: In IOS, you use a route-map to control redistribution between protocols. For example, if you want to redistribute just those BGP prefixes tagged with a specific community into OSPF, you will probably configure something that looks like this:

    route-map bgp-to-ospf permit 10
     match community $COMMUNITY
    !
    route-map bgp-to-ospf deny 20
    !
    router ospf $PID
     redistribute bgp $ASN subnets route-map bgp-to-ospf

Now, consider the following failure scenarios:

1. Someone typos a BGP config elsewhere in your network and attaches $COMMUNITY to a whole bunch more routes... say, all 350k being sent by your upstream provider. *oops*

2. An engineer thinks that there's something wrong with the redistribution and decides to temporarily disable it as part of the troubleshooting process. He types the following:

    conf t
    router ospf $PID
     no redistribute bgp $ASN subnets route-map bgp-to-ospf

*boom* He just dumped all BGP routes into OSPF, due to the way IOS parses the command: it removes the route-map but leaves the redistribution intact.

To be fair, Cisco does provide you with tools to mitigate this risk (see the redistribute maximum-prefix command) but the point is that this is a fairly easy mistake to make. At the end of the day, the reason that many folks advise against the redistribution of BGP into an IGP is that it sets the stage for a seemingly insignificant mistake to cause a not-so-insignificant outage.

--Jeff
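For anyone who wants to see the second failure case in numbers, here is a toy Python model of it. It is only a sketch: the Route class, the redistribute() function, the community value 65000:100 and the 1000-route table are all made up for illustration, and the max_prefixes guard is a simplified stand-in for what redistribute maximum-prefix does, not a model of the real command.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

TAG = "65000:100"  # hypothetical community standing in for $COMMUNITY


@dataclass(frozen=True)
class Route:
    prefix: str
    communities: frozenset = frozenset()


def redistribute(bgp_routes: Iterable[Route],
                 route_map: Optional[Callable[[Route], bool]] = None,
                 max_prefixes: Optional[int] = None) -> List[Route]:
    """Return the BGP routes that would be injected into the IGP.

    route_map=None models redistribution with no filter at all, which is
    the state left behind once the route-map has been stripped off.
    """
    selected = [r for r in bgp_routes if route_map is None or route_map(r)]
    if max_prefixes is not None:
        # Simplified guard (assumption): stop accepting routes at the limit.
        selected = selected[:max_prefixes]
    return selected


def tagged(route: Route) -> bool:
    """Stand-in for 'match community $COMMUNITY'."""
    return TAG in route.communities


# Toy BGP table: 10 tagged routes plus 990 untagged ones.
bgp_table = [
    Route(f"10.{i // 256}.{i % 256}.0/24",
          frozenset({TAG}) if i < 10 else frozenset())
    for i in range(1000)
]

intended = redistribute(bgp_table, route_map=tagged)    # filter in place
after_typo = redistribute(bgp_table)                    # route-map removed
with_guard = redistribute(bgp_table, max_prefixes=50)   # no filter, but capped

print(len(intended), len(after_typo), len(with_guard))  # prints: 10 1000 50
```

The jump from 10 to 1000 is the whole problem in miniature; the cap does not make the result correct, it only bounds how much damage the missing filter can do.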
Re: Introducing draft-denog-v6ops-addresspartnaming
[ Meant to send this to the list and not directly to Richard. ] On Fri, Nov 19, 2010 at 03:07:40AM +0100, Richard Hartmann wrote: If any of you have any additional suggestions, you are more than welcome to share them. I heard hexquad somewhere awhile back and have been using it since... looking over the other options present in your poll, I think I still prefer it, but I could live with either hextet or simply quad as well. --Jeff
Re: RIP Justification
On Fri, Oct 01, 2010 at 04:28:30PM +, Tim Franklin wrote: Leaf-node BGP config is utterly trivial [...] The Enterprise guys really need to get out of the blanket BGP is scary mindset It's not just enterprise mindset. Over the years I've seen a lot of deployed gear that either didn't support BGP at all or for which it was a significant extra cost. At least in the past this applied to many firewalls and load-balancers, and until recently, even one of the major CMTS vendors didn't support BGP. I agree that edge-node BGP is simple, but finding gear that supports it isn't necessarily so. --Jeff
Re: Time Warner/Road Runner issues in the Mid West
On Fri, Oct 09, 2009 at 07:30:19AM -0700, Mike Maberry wrote: Is anyone else seeing connectivity issues to the internet using Time Warner/Road Runner in the Mid West? Kansas City and Wisconsin seem to be unable to access sites on the west coast... Mike, There is an ongoing issue that our ops folks are currently troubleshooting. I don't have any details at this time, but if you've got a traceroute or other details on the specific issue that you're seeing, feel free to forward to me directly and I'll make sure it gets to the right parties here. Thanks, --Jeff
Re: Data Center testing
On Tue, Aug 25, 2009 at 10:45:07PM -0500, Frank Bulk - iName.com wrote: There's more to data integrity in a data center (well, anything powered, that is) than network configurations. Understood and agreed. My point was that induced failure testing isn't the right way to catch incorrect or unauthorized config changes, which is what I understood the original poster to have said was his problem. My apologies if I misunderstood what he was asking. So while your analogy emphasizes the importance of having good processes in place to catch the problems up front, it doesn't eliminate throwing the switch. Yup, and it's precisely why I suggested using planned maintenance events as one way of doing at least limited failure testing. --Jeff
Re: Data Center testing
On Mon, Aug 24, 2009 at 09:38:38AM -0400, Dan Snyder wrote: We have done power tests before and had no problem. I guess I am looking for someone who does testing of the network equipment outside of just power tests. We had an outage due to a configuration mistake that became apparent when a switch failed. It didn't cause a problem however when we did a power test for the whole data center. Dan, With all due respect, if there are config changes being made to your devices that aren't authorized or in accordance with your standards (you *do* have config standards, right?) then you don't have a testing problem, you have a data integrity problem. Periodically inducing failures to catch them is sorta like using your smoke detector as an oven timer. There are several tools that can help in this area; a good free one is rancid [1], which logs in to your routers and collects copies of configs and other info, all of which gets stored in a central repository. By default, you will be notified via email of any changes. An even better approach than scanning the hourly config diff emails is to develop scripts that compare the *actual* state of the network with the *desired* state and alert you if the two are not in sync. Obviously this is more work because you have to have some way of describing the desired state of the network in machine-parsable format, but the benefit is that you know in pseudo-realtime when something is wrong, as opposed to finding out the next time a device fails. Rancid diffs + tacacs logs will tell you who made the changes, and with that info you can get at the root of the problem. Having said that, every planned maintenance activity is an opportunity to run through at least some failure cases. If one of your providers is going to take down a longhaul circuit, you can observe how traffic re-routes and verify that your metrics and/or TE are doing what you expect. 
Any time you need to load new code on a device you can test that things fail over appropriately. Of course, you have to be willing to just shut the device down without draining it first, but that's between you and your customers. Link and/or device failures will generate routing events that could be used to test convergence times across your network, etc. The key is to be prepared. The more instrumentation you have in place prior to the test, the better you will be able to analyze the impact of the failure. An experienced operator can often tell right away when looking at a bunch of MRTG graphs that something doesn't look right, but that doesn't tell you *what* is wrong. There are tools (free and commercial) that can help here, too. Have a central syslog server and some kind of log reduction tool in place. Have beacons/probes deployed, in both the control and data planes. If you want to record, analyze, and even replay routing system events, you might want to take a look at the Route Explorer product from Packet Design [2]. You said switch failure above, so I'm guessing that this doesn't apply to you, but there are also good network simulation packages out there. Cariden [3] and WANDL [4] can build models of your network based on actual router configs and let you simulate the impact of various scenarios, including device/link failures. However, these tools are more appropriate for design and planning than for catching configuration mistakes, so they may not be what you're looking for in this case. --Jeff
[1] http://www.shrubbery.net/rancid/
[2] http://www.packetdesign.com/products/rex.htm
[3] http://www.cariden.com/
[4] http://www.wandl.com/html/index.php
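To make the actual-versus-desired comparison a bit more concrete, here is a minimal Python sketch. It assumes you already have both configs as plain text (for example, the desired one from a template and the actual one from whatever your config collector saved); the function names, the sample config and the alerting-by-print are all hypothetical, and a real version would need per-platform normalization.

```python
import difflib
from typing import List


def normalize(config: str) -> List[str]:
    """Drop blank lines and bare '!' separators, strip trailing whitespace."""
    lines = []
    for raw in config.splitlines():
        line = raw.rstrip()
        if not line or line.strip() == "!":
            continue
        lines.append(line)
    return lines


def config_drift(desired: str, actual: str) -> List[str]:
    """Return '-' lines missing from the device and '+' lines that are extra."""
    want, have = normalize(desired), normalize(actual)
    matcher = difflib.SequenceMatcher(a=want, b=have, autojunk=False)
    drift = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op in ("delete", "replace"):
            drift += [f"- {line}" for line in want[i1:i2]]
        if op in ("insert", "replace"):
            drift += [f"+ {line}" for line in have[j1:j2]]
    return drift


# Hypothetical example: the route-map has gone missing on the device.
DESIRED = """\
router ospf 1
 redistribute bgp 65000 subnets route-map bgp-to-ospf
 network 10.0.0.0 0.0.0.255 area 0
!
"""
ACTUAL = """\
router ospf 1
 redistribute bgp 65000 subnets
 network 10.0.0.0 0.0.0.255 area 0
"""

drift = config_drift(DESIRED, ACTUAL)
for line in drift:
    print(line)  # in practice this would feed an alert, not stdout
```

Run on a schedule against every device, an empty result means the box matches its template, and anything else is worth a look before the next failure finds it for you.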
Re: ISP BGP Resources
On Fri, Jul 10, 2009 at 08:17:43AM -0400, Babak Pasdar wrote: Are there any resources (books, web sites, mailing lists, etc..) that anyone can recommend? Richard Steenbergen did a nice preso on this subject a couple years ago: http://www.nanog.org/meetings/nanog40/presentations/BGPcommunities.pdf --Jeff
Re: Sprint v. Cogent, some clarity facts
On Mon, Nov 03, 2008 at 04:34:16PM -0200, Nicolas Antoniello wrote: Sorry for my possible ignorance, but could you explain me what are you calling transit-free? Transit-free means that you don't pay anyone else to reach some 3rd-party network. In other words, if I'm Sprint, I don't pay UUNET to get to X. Either X connects directly with me or X pays someone else to get to me. If I can make that claim for all values of X, then I am transit-free. Note that while I don't pay another network for access to its *peers* (that's transit) I might pay for access to its customers. This is typically called paid peering or settlement-based peering, but sometimes it can just be plain transit that's modified with communities to look like peering. To add to the confusion, the latter case might be described differently by both parties; the seller probably says X is a transit customer of mine, and the buyer says I have peering with Y, and in this case, neither one is lying (mostly). If you didn't see the reference a month or so ago when Paul sent it, the following link might be interesting to you: http://arstechnica.com/guides/other/peering-and-transit.ars --Jeff
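The "for all values of X" test above can be written down as a small reachability check. What follows is a rough Python sketch under simplifying assumptions: the five networks A through E and their relationships are invented, only plain customer and settlement-free peer relationships are modeled, and paid peering (which is exactly where the definitions get fuzzy) is left out entirely.

```python
from typing import Dict, Set


def customer_cone(net: str, customers: Dict[str, Set[str]]) -> Set[str]:
    """Everyone who pays (directly or indirectly) to reach `net`."""
    seen: Set[str] = set()
    stack = [net]
    while stack:
        for cust in customers.get(stack.pop(), ()):
            if cust not in seen:
                seen.add(cust)
                stack.append(cust)
    return seen


def buys_transit(net: str, customers: Dict[str, Set[str]]) -> bool:
    """True if `net` shows up as somebody's customer."""
    return any(net in custs for custs in customers.values())


def is_transit_free(net: str, all_nets: Set[str],
                    customers: Dict[str, Set[str]],
                    peers: Dict[str, Set[str]]) -> bool:
    """True if `net` pays no one and still reaches every network X."""
    if buys_transit(net, customers):
        return False
    # Reachable without paying: my customers, my peers, my peers' customers.
    reachable = {net} | customer_cone(net, customers)
    for peer in peers.get(net, ()):
        reachable |= {peer} | customer_cone(peer, customers)
    return reachable >= all_nets


# Hypothetical topology: A and B peer; C buys from A, D from B, E from C.
NETS = {"A", "B", "C", "D", "E"}
CUSTOMERS = {"A": {"C"}, "B": {"D"}, "C": {"E"}}
PEERS = {"A": {"B"}, "B": {"A"}, "D": {"E"}, "E": {"D"}}

for name in sorted(NETS):
    print(name, is_transit_free(name, NETS, CUSTOMERS, PEERS))
```

In this toy topology only A and B come out transit-free, and if you remove the peering between them neither does, since each then has no unpaid way to reach the other's customers.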
Re: Is it time to abandon bogon prefix filters?
On Mon, Aug 18, 2008 at 09:51:20AM +0100, [EMAIL PROTECTED] wrote: m4 is a macro processor that you probably should not bother learning since you can do everything that it does by using Python Oh, Abley is gonna have fun with this... and for the record, my money is on Joe. He could probably implement python *IN* m4 if you offered enough beer! --Jeff
Re: Traceroute and random UDP ports
On Wed, Aug 13, 2008 at 07:56:53AM -0500, John Kristoff wrote: Also, why do we increase the UDP port number with each subsequent traceroute packet that is sent? I don't know definitively, but I have an of educated guess

From /usr/src/contrib/traceroute/traceroute.c:

/*
 * Notes
 * -----
 * [...]
 * The udp port usage may appear bizarre (well, ok, it is bizarre).
 * The problem is that an icmp message only contains 8 bytes of
 * data from the original datagram. 8 bytes is the size of a udp
 * header so, if we want to associate replies with the original
 * datagram, the necessary information must be encoded into the
 * udp header (the ip id could be used but there's no way to
 * interlock with the kernel's assignment of ip id's and, anyway,
 * it would have taken a lot more kernel hacking to allow this
 * code to set the ip id). So, to allow two or more users to
 * use traceroute simultaneously, we use this task's pid as the
 * source port (the high bit is set to move the port number out
 * of the likely range). To keep track of which probe is being
 * replied to (so times and/or hop counts don't get confused by a
 * reply that was delayed in transit), we increment the destination
 * port number before each probe.
 * [...]
 * -- Van Jacobson ([EMAIL PROTECTED])
 * Tue Dec 20 03:50:13 PST 1988
 */

--Jeff
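The scheme in that comment is easy to sketch. The Python below is an illustration only, not the traceroute source: the function names are mine, and the base port of 32768 + 666 and the exact 0x8000 mask are what I recall from the classic implementation, so treat both as assumptions. It shows how the pid and probe number go into the UDP header and how they come back out of the 8 bytes quoted in the ICMP reply.

```python
import struct
from typing import Optional

BASE_PORT = 32768 + 666  # assumed default base destination port (33434)


def source_port(pid: int) -> int:
    """Low 16 bits of the pid with the high bit set."""
    return (pid & 0xFFFF) | 0x8000


def dest_port(seq: int, base: int = BASE_PORT) -> int:
    """Destination port for probe number `seq` (first probe is seq 1)."""
    return base + seq


def match_probe(quoted_udp_header: bytes, pid: int,
                base: int = BASE_PORT) -> Optional[int]:
    """Map the 8 quoted bytes of UDP header back to a probe number.

    Returns None when the source port says the probe belongs to some
    other traceroute process.
    """
    sport, dport, _length, _checksum = struct.unpack(
        "!HHHH", quoted_udp_header[:8])
    if sport != source_port(pid):
        return None
    return dport - base


# Build the header of probe 7 from pid 1234, as it would be echoed in ICMP.
echoed = struct.pack("!HHHH", source_port(1234), dest_port(7), 12, 0)

print(match_probe(echoed, pid=1234))  # prints: 7
print(match_probe(echoed, pid=4321))  # prints: None
```

Note that two pids differing only in the high bit (1234 and 34002, say) map to the same source port, so the scheme separates concurrent users in practice rather than in principle.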