Re: Curiosity about AS3356 L3/CenturyLink network resiliency (in general)

Mike Hammett Thu, 17 May 2018 06:25:50 -0700

I often question why/how people build networks the way they do. There's some 
industry hard-on with having a few ginormous routers instead of many smaller 
ones. Building Internet Exchanges, I've learned that the number of networks 
that don't have BGP edge routers in major markets where they have a presence 
is quite a bit larger than one would expect. I heard a podcast once (I forget 
if it was Packet Pushers or Network Collective) postulating that the reason 
everything runs back to a few big-ass routers is that someone decided to spend 
a crap-load of money on big-ass routers for bragging rights, so now they have 
to run everything they can through them A) to "prove" their purchase wasn't 
foolish and B) because they now can't afford to buy anything else.


There's no reason Tampa shouldn't have a direct L3 adjacency to each of Miami, 
Atlanta, Houston, and Charlotte, over diverse infrastructure to all four. 
Obviously there's room to add/drop from that list, but it gets the point 
across. 
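
To put a rough number on that, here is a minimal sketch (not anyone's actual 
topology) that brute-forces how many link failures it takes to cut Tampa off 
from a core site. Both link lists are made up for illustration: `two_route` 
loosely mirrors the two-path layout described in the quoted message, and 
`four_way` adds the adjacencies suggested above. The function names and the 
choice of Atlanta as the "core" are my assumptions as well.

```python
from itertools import combinations

def reachable(links, start):
    """Return the set of sites reachable from start over the given links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a, b in links:
            # Links are bidirectional: follow whichever end is not `node`.
            nxt = b if a == node else a if b == node else None
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def min_cuts_to_isolate(links, site, core):
    """Smallest number of failed links that separates site from core.

    Brute force over every combination of links, so only suitable for
    toy topologies like the ones below.
    """
    for k in range(len(links) + 1):
        for cut in combinations(links, k):
            remaining = [link for link in links if link not in cut]
            if core not in reachable(remaining, site):
                return k
    return None

# Hypothetical topologies, for illustration only.
two_route = [
    ("Tampa", "Miami"), ("Tampa", "Houston"),
    ("Miami", "Atlanta"), ("Houston", "Atlanta"),
]
four_way = two_route + [
    ("Tampa", "Atlanta"), ("Tampa", "Charlotte"), ("Charlotte", "Atlanta"),
]

print(min_cuts_to_isolate(two_route, "Tampa", "Atlanta"))
print(min_cuts_to_isolate(four_way, "Tampa", "Atlanta"))
```

With these made-up link lists the first call gives 2 and the second gives 4. 
The sketch treats every link as failing independently, which only holds if 
the paths really are on diverse infrastructure; two "links" in the same 
conduit count as one for this purpose.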



----- 
Mike Hammett 
Intelligent Computing Solutions 
http://www.ics-il.com 

Midwest-IX 
http://www.midwest-ix.com 

----- Original Message -----

From: "David Hubbard" <dhubb...@dino.hostasaurus.com> 
To: nanog@nanog.org 
Sent: Wednesday, May 16, 2018 11:59:42 AM 
Subject: Curiosity about AS3356 L3/CenturyLink network resiliency (in general) 

I’m curious if anyone who’s used 3356 for transit has found shortcomings in how 
their peering and redundancy are configured, or what a normal expectation 
would be. The Tampa Bay market has been completely down for 3356 IP services 
twice so far this year, each time for what I’d consider an unacceptable period 
(many hours). I’m learning that the entire market is served by just two 
fiber routes, through cities hundreds of miles away in either direction. So, 
basically, two fiber cuts, potentially 1000+ miles apart, take the entire 
region down. The most recent occurrence was a week or so ago, when a Miami-area 
cut and an Orange, Texas cut (1287 driving miles apart) took IP services down 
for hours. It did not take down point-to-point circuits to out-of-market 
locations, which suggests they have the ability to be more redundant and 
simply choose not to be. 

I feel like it’s not unreasonable to expect more redundancy, or a much smaller 
attack surface, given that a disgruntled lineman who knows the routes could 
take an entire region down with planned cuts four states apart. Maybe other 
regions are better designed? Or are my expectations unreasonable? I carry 
three peers in that market, so it hasn’t been outage-causing. But I use 3356 
in other markets too, and have plans for more, so it makes me wonder if I just 
haven't had the pleasure of similar outages elsewhere yet and should factor 
that expectation into the design. It creates a problem for me in one location 
where I can only get them and Cogent, since Cogent can't be relied on for IPv6 
service, which I need. 

Thanks
