Re: Hijacked IP space.
On Mon, Nov 03, 2003 at 09:17:38PM +, Andrew - Supernews wrote:
> chuck == chuck goolsbee [EMAIL PROTECTED] writes:
>
> chuck> Of course I have no hard data, other than my client's phone
> chuck> call about another phone call, so I can't query based on a
> chuck> timestamp to see where this was being announced from. It
> chuck> appears to have vanished, and has remained so according to my
> chuck> casual glances here and there.
>
> chuck> The netblock in question is: 204.89.0.0/21
>
> No announcement for that block has been visible here at any time in
> the past couple of weeks (specifically, since Oct 13). We might have
> missed it if it was never announced for more than a few minutes at a
> time, but it's _much_ more likely that the block was never announced
> and was merely forged into the headers of a spam.

Our system reports that neither that prefix, nor any of its more-specifics, has been seen in the global routing tables at any moment since January 1st, 2002.

[ http://www.renesys.com ]

--
James Cowie
Renesys Corporation
cowie at renesys.com
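A containment check like the one described above -- has this prefix, or any more-specific of it, ever appeared in a table? -- can be sketched with Python's standard ipaddress module. The observed prefixes below are illustrative, not real table data:

```python
import ipaddress

SUSPECT = ipaddress.ip_network("204.89.0.0/21")

def covered_by_suspect(prefix: str) -> bool:
    """True if prefix equals, or is a more-specific of, the suspect block."""
    net = ipaddress.ip_network(prefix)
    return net.version == SUSPECT.version and net.subnet_of(SUSPECT)

# Illustrative observed routes, not real table data:
observed = ["204.89.0.0/21", "204.89.4.0/24", "204.88.0.0/16", "204.89.8.0/24"]
hits = [p for p in observed if covered_by_suspect(p)]
# hits -> ["204.89.0.0/21", "204.89.4.0/24"]
```

Note that subnet_of() treats a network as a subnet of itself, so the exact prefix and all its more-specifics are caught in one test.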
Re: BellSouth prefix deaggregation (was: as6198 aggregation event)
On Sun, Oct 12, 2003 at 01:18:59AM -0400, Terry Baranski wrote:
> More on this - Two of BellSouth's AS's (6197 and 6198) have combined
> to inject around 1,000 deaggregated prefixes into the global routing
> tables over the last few weeks (in addition to their usual load of
> ~600+, for a total of ~1,600).

Kudos to BellSouth for taking first steps to clean this up overnight -- about 1,100 of the prefixes left the table between about 01:45 and 03:45 GMT. Very nice deflation.

Next topic: multiple origin ASNs ..

--
James Cowie
Renesys Corporation
[EMAIL PROTECTED]
603-464-5799 voice
603-464-6014 fax
as6198 aggregation event
On Friday, we noted with some interest the appearance of more than six hundred deaggregated /24s in the global routing tables. More unusually, they're still in there this morning.

AS6198 (BellSouth Miami) seems to have been patiently injecting them over the course of several hours, between about 04:00 GMT and 08:00 GMT on Friday morning (3 Oct 2003). Here's a quick pic:

http://www.renesys.com/images/6198.gif

I can share more details offline.

Usually when we see deaggregations, they hit quickly and they disappear quickly: nice sharp vertical jumps in the table size. This event lasted for hours and, more importantly, the prefixes haven't come back out again -- an unusual pattern for a single-origin change that effectively expanded the global tables by half a percent.

Can anyone share any operational insight into the likely causes of this event? Has it caused any operational problems for anyone? Thanks!

--
James Cowie
Renesys Corporation
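For what it's worth, spotting this kind of deaggregation programmatically amounts to flagging prefixes that are covered by a shorter prefix announced by the same origin AS. A minimal sketch, using documentation prefixes rather than the actual AS6198 announcements:

```python
import ipaddress
from collections import defaultdict

def deaggregates(routes):
    """routes: iterable of (prefix_str, origin_asn) pairs.
    Return the more-specifics that are covered by a shorter prefix
    announced by the same origin AS (i.e., redundant deaggregates)."""
    by_origin = defaultdict(list)
    for prefix, asn in routes:
        by_origin[asn].append(ipaddress.ip_network(prefix))
    extras = []
    for asn, nets in by_origin.items():
        for net in nets:
            if any(net != other and net.subnet_of(other) for other in nets):
                extras.append((str(net), asn))
    return extras

# Hypothetical table fragment in the spirit of the event:
routes = [("192.0.2.0/24", 6198), ("192.0.2.0/25", 6198),
          ("198.51.100.0/24", 6198), ("203.0.113.0/24", 701)]
# deaggregates(routes) -> [("192.0.2.0/25", 6198)]
```

A real version would work from a full table dump and would also want to check that the covering aggregate and the more-specific carry the same AS path, but the containment test is the core of it.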
Re: what happened to ARIN tonight ?
On Sun, Sep 28, 2003 at 10:26:00PM -0400, Robert Boyle wrote:
> At 10:07 PM 9/28/2003, you wrote:
> > I am seeing the same. ARIN is completely off the air
> >
> > box02rsm-en01.twdx.net
> > sh ip bgp 192.149.252.16
> > % Network not in table
>
> I see them via a UUNet announcement through Veroxity and Sprint
> transit, but I don't see it via any other peer or transit provider.
> Are they multi-homed?

Single-homed /24 through UUNet's 7046 to 701.

Withdrawals started at 01:21:38 GMT (21:21:38 Eastern time), and ARIN flapped severely for about fifteen minutes. Then they spent another hour and ten minutes inconsistently reachable from half the world, with the picture mutating slowly. 701 seems to have been telling inconsistent stories about ARIN's reachability, depending on which of our peers you consulted -- IGP instability?

By 03:10 GMT everyone seems to have slowly gotten a consistent picture again, with ARIN restored.

AS7046 advertises 320 prefixes, of which ARIN's is but one. Others (139.177.0.0/16, 139.85.0.0/16, 192.136.136.0/24) had even worse problems in the same timeframe.

My guess would be the collision of two maintenance windows -- a simple problem on ARIN's end at the wrong moment, with global convergence severely prolonged by one of the UUNet BGP tummy-rumbles that we commonly hear in the middle of the night. Some of the little UUNet ASNs with 701 upstream can be really noisy -- 816 was rocking all night long.

--
James Cowie
Renesys Corporation
[EMAIL PROTECTED]
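The timeline above -- first withdrawals at 01:21:38 GMT, a consistent picture again by 03:10 GMT -- works out to just under two hours end to end. The arithmetic, as a trivial sketch (the date is inferred from the post's timestamps):

```python
from datetime import datetime, timedelta

fmt = "%Y-%m-%d %H:%M:%S"
withdrawn = datetime.strptime("2003-09-29 01:21:38", fmt)  # first withdrawals (GMT)
flap_end = withdrawn + timedelta(minutes=15)               # severe flapping ends
restored = datetime.strptime("2003-09-29 03:10:00", fmt)   # consistent picture again

outage = restored - withdrawn
# int(outage.total_seconds() // 60) -> 108 minutes end to end
```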
Re: On the back of other 'security' posts....
On Sat, Aug 30, 2003 at 08:17:39PM +1000, Matthew Sullivan wrote:
> Hi All,
>
> On the back of the latest round of security related posts, anyone
> notice the 50% packet loss (as reported to me) across the USA - NZ
> links around lunchtime (GMT+10) today?

Yep, easily .. we saw big routing problems for that /19 and a lot of innocent bystanders, in two waves, correlated across most sources.

First signs of trouble at 23:12 GMT. Then long periods of unreachability, from roughly 23:15-23:42 GMT and 01:26-02:10 GMT (the second one less well-correlated; routes were still available from some of our peers). Routes were fully restored globally by 02:12 GMT, presumably after the upstream rate limiting kicked in.

If anyone has mrtg plots for the affected links, I'd sure like to see 'em.

--
James Cowie
Renesys Corporation
[EMAIL PROTECTED]
Re: Optigate
On Mon, Aug 25, 2003 at 09:17:49AM -0700, Mark V. A. Bown wrote:
> This morning at 4:30am pst 200 Paul St. facility in San Francisco
> went dark. The power died and took our 200 Paul St. Router offline.
> The power was restored at 8:00am pst.

I think you mean PDT. Also, while DC went out at 4:30AM and was restored at 8:00AM, AC power went out at around 3:00AM and wasn't restored until 8:30AM.

Both 65.60.0.0/18 and 64.200.195.0/24 were out from 03:09:19 PDT; they were restored starting at 08:12:27 PDT, converged globally by 08:14:15 PDT.

--
James Cowie
Renesys Corporation
[EMAIL PROTECTED]
Re: BGP route tracking.
> > Anybody watching the bgp routing table.. I see about 5,000 less
> > routes than usual. Anybody know a good pointer..
>
> Okay, here are a couple quick screenshots of what we're looking at
> tonight. [..]

We've collected some more plots and maps describing BGP outage patterns during last week's blackout. Feedback welcome.

http://www.renesys.com/news

--
James Cowie
Renesys Corporation
Re: BGP route tracking.
Some updated images of routing table size and 7-day prefix withdrawals:

http://gradus.renesys.com/aug2003/blackout3-rtsize.gif
http://gradus.renesys.com/aug2003/aug7-aug15-withdrawals.gif

(The blackout is event #3 on the right.)

We're about halfway back to the table sizes we started with yesterday before the grid tripped. If it trends the same way through the afternoon, it could be back to its old self sometime around midnight GMT; previous experience suggests that it might stabilize somewhat higher than before the event. At any rate, steady improvement as power returns across the East.

--
James Cowie
Renesys Corporation
[EMAIL PROTECTED]

> Okay, here are a couple quick screenshots of what we're looking at
> tonight.
>
> First, a plot that shows the routing table size shrinkage since the
> onset of the blackout at 16:13:07 +/- EDT, across a group of routers.
>
> http://gradus.renesys.com/aug2003/blackout1-rtsize.gif
>
> Second, a wider-angle 3D plot of prefix withdrawal rates over the
> last week, reported by various peers (one line per peer).
>
> http://gradus.renesys.com/aug2003/aug7-aug14-withdrawals.gif
>
> The blackout is the big event at the right-hand edge (#3). Note the
> sustained high rates of route withdrawal that have been the norm
> since the onset of MSBlast. Unlike typical single-cause events (like
> the one marked #1), MSBlast scanning has caused prefix withdrawal
> rates to gently lift off into a noisier mode across the board,
> lasting for days (so far).
>
> > (im just getting bored waiting for us to run out of fuel)
>
> Ouch ..
>
> --
> James Cowie
> Renesys Corporation
> [EMAIL PROTECTED]
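The per-peer withdrawal-rate surfaces in those plots boil down to bucketing withdrawal timestamps into fixed time bins, one series per peer. A minimal sketch with synthetic event data (timestamps and peer names here are made up):

```python
from collections import Counter

BIN = 300  # seconds per bucket (5-minute bins)

def withdrawal_rates(events):
    """events: iterable of (unix_ts, peer_id) withdrawal records.
    Return {(peer_id, bucket_start): count} -- one series per peer,
    one count per time bin, i.e. one line per peer in the 3D plot."""
    rates = Counter()
    for ts, peer in events:
        rates[(peer, ts - ts % BIN)] += 1
    return rates

# Synthetic events: a burst in the second bucket for peer "A"
events = [(10, "A"), (20, "B"), (310, "A"), (320, "A"), (330, "A")]
rates = withdrawal_rates(events)
# rates[("A", 300)] -> 3
```

A sustained lift in these counts across most peers at once -- rather than a spike at one peer -- is the signature distinguishing a global event from local session noise.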
Re: Level3 routing issues?
> > Wow, for a minute I thought I was looking at one of our old plots,
> > except for the fact that the x-axis says January 2003 and not
> > September 2001 :) :)
>
> seeing that the etiology and effects of the two events were quite
> different, perhaps eyeglasses which make them look the same are not
> as useful as we might wish?
>
> randy

If you've been watching, you might agree that the interesting thing is not that it looked like that in September 2001, but that we really haven't seen a signal that looks like that SINCE September 2001.

The large differences between the worms are exactly what should make us doubly interested in fingering the common mechanism that connects very high-speed, high-diversity worm scanning to increased BGP activity. So far it's been visible as an apparently accidental byproduct of an attack with other goals. Are you willing to bet your bifocals that the same mechanism can't be weaponized and used against the routing infrastructure directly in the future?

--
James Cowie
Renesys Corporation
http://gradus.renesys.com
Re: Level3 routing issues?
> > So far it's been visible as an apparently accidental byproduct of
> > an attack with other goals. Are you willing to bet your bifocals
> > that the same mechanism can't be weaponized and used against the
> > routing infrastructure directly in the future?
>
> Yet the question becomes the reasoning behind it. How much is a
> direct result of the worm and how much is a result of actions taken
> by the NEs?

Good question. Null-routing traffic destined to a network with a BGP interface on it will cause the session to drop. That is a BGP effect due to engineers' actions, indirectly triggered by the worm.

On the other hand, we also know (from private communications and from other mailing lists.. ahem) that high rate and high src/dst diversity of scans causes some network devices to fail (devices that cache flows, or devices that suffer from CPU overload under such conditions). Some BGP-speaking routers (not all, by any means, but some subpopulation) found themselves pegged at 100% CPU on Saturday. Just one example:

http://noc.ilan.net.il/stats/ILAN-CPU/new-gp-cpu.html

Whether you believe anthropogenic explanations for the instability depends on how fast you believe NEs can look, think, and type, compared to the speed with which the BGP announcement and withdrawal rates are observed to take off. For my part, I'd bet that the long slow exponential decay (with superimposed spiky noise) is people at work. But the initial blast is not.

--
James Cowie
Renesys Corporation
http://gradus.renesys.com
Re: Level3 routing issues?
> > here's a plot showing the impact on BGP routing tables from seven
> > ISPs (plotted using route-views data):
> > http://www.research.att.com/~griffin/bgp_monitor/sql_worm.html
>
> And as an interesting counterpoint to this, this graph shows the
> number of BGP routing updates received at MIT before, during, and
> after the worm (3-day window). Tim's plots showed that the number of
> actual routes at the routers he watched was down significantly --
> these plots show that the actual BGP traffic was up quite a bit.
> Probably the withdrawals that were taking routes away from
> route-views...
> http://nms.lcs.mit.edu/~dga/sqlworm.html
> -Dave

Wow, for a minute I thought I was looking at one of our old plots, except for the fact that the x-axis says January 2003 and not September 2001 :) :)

Your plot is consistent with what we saw on Saturday as well. Looks much like a little Nimda. Blast from the past:

http://www.renesys.com/projects/bgp_instability

--jim

--
James Cowie
Renesys Corporation
http://gradus.renesys.com
Re: MIA: oregon-ix.net
> As some of you have noticed, the BGP4 route containing the address
> for route-views.oregon-ix.net disappeared a while ago (mid-October?).
> Their website seems to be gone, and I swear, I couldn't resolve the
> domain for a little while just now. Has the Oregon IX been shut down?

As others have noted, they just had DNS problems. Their routes appear to be live. In fact, the stability of 198.32.162.0/24 is pretty good, by and large. They did have one global outage of about an hour and a half on October 1st, starting at 12:03 GMT. Also, back on September 13th, between 12:32 and 13:51 GMT, they were (accidentally or deliberately) being originated by 15919 (Interhost), creating a brief blackhole situation. They're otherwise usually advertised by 3701, although you'll also see Verio originating them, depending on where you look.

> Their route-server was probably the best-connected one, with the most
> views, of any public route server I am aware of (please prove me
> wrong, but do not torment me with any web-based looking glasses :)

Yeah, for real forensics, neither looking glasses nor public route servers are ideal solutions. The former have single-site myopia and the latter have no good tools. That's why we built our own infrastructure (http://gradus.renesys.com).

> Nothing like having to poke around 10 other RS's to establish that
> rogue AS 26212 really only has 1, 6402 and 2914 as their upstreams.

Also 2516, 3257, 4513, 6730, and 6939, just in the last few weeks.

--jim
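Spotting an episode like the 15919/3701 one above is a multiple-origin-AS (MOAS) check: flag any prefix whose observed origin set contains more than one ASN. A minimal sketch, with hypothetical observations echoing that episode:

```python
from collections import defaultdict

def moas_prefixes(announcements):
    """announcements: iterable of (prefix, origin_asn) observations.
    Return {prefix: sorted origins} for prefixes with more than one
    distinct origin AS -- candidates for misconfiguration or hijack."""
    origins = defaultdict(set)
    for prefix, asn in announcements:
        origins[prefix].add(asn)
    return {p: sorted(asns) for p, asns in origins.items() if len(asns) > 1}

# Hypothetical observations, not real archive data:
obs = [("198.32.162.0/24", 3701), ("198.32.162.0/24", 15919),
       ("192.0.2.0/24", 64512)]
# moas_prefixes(obs) -> {"198.32.162.0/24": [3701, 15919]}
```

In practice you'd also window the observations by time, since legitimate origin changes (renumbering, anycast, provider moves) look identical to a hijack in a flat origin count.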
Re: Do ATM-based Exchange Points make sense anymore?
Mike Hughes wrote:
> With the shorter timers or fast-external-fallover, a very short
> maintenance slot at a large exchange can cause ripples in the routing
> table. It would be interesting to do some analysis of this - how far
> the ripples spread from each exchange!

We do BGP instability research, and this is something I'd like to examine further. Compared to other sources of BGP noise, I don't think it's a primary driver of the instability we monitor each day, but I'd like to quantify it. If people were willing to give us a heads-up after the fact when there were .. um .. maintenance events at the major exchanges, we could then go back and look at global propagation of the ripples on fine timescales.

--jim

p.s. The more eyes we have, the more we see, and we are always looking for more silent peers, especially small and midsize providers or their multihomed customers. See http://renesys.com/cgi-bin/bgpfeed to sign up to send us a one-way multihop EBGP feed. It's quick, painless, and you will be helping unravel the mysteries of why global routing works so well in spite of us all.
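For the curious, a one-way multihop EBGP feed of that sort looks roughly like the following in IOS-style configuration. Every address and ASN here is a placeholder -- the real peering details come from the sign-up page -- and the inbound deny-all filter is what makes the feed one-way:

```
router bgp 64512
 ! 192.0.2.1 stands in for the collector's address; 64512 and 65535
 ! are placeholder ASNs, not the real ones
 neighbor 192.0.2.1 remote-as 65535
 neighbor 192.0.2.1 ebgp-multihop 255
 neighbor 192.0.2.1 update-source Loopback0
 neighbor 192.0.2.1 prefix-list nothing-in in
!
! accept no routes back from the collector: a one-way feed
ip prefix-list nothing-in deny 0.0.0.0/0 le 32
```

The ebgp-multihop knob is what lets the session run to a collector many hops away instead of a directly connected neighbor; the session carries your view of the table out, and nothing comes back in.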
Re: effects of NYC power outage
[NANOG has been bouncing my attempts to reply to this thread for several days, possibly because I quoted the word "u n st a b l e" early on, apparently triggering the "un subs cribe" filter for words that start with "uns" and contain a "b". If your posts to NANOG have been silently bounced in the past, and your network's operational issues lead you to start your posts with words like "uns tab le" or "uns ol vable" or "uns uit able" or "u nsp eak able", wonder no more.]

At any rate, about two days ago Senthil wrote:
> BGP was more [un st a ble] during Code Red propagation
> (http://www.renesys.com/projects/bgp_instability/). A quick peek into
> both the graphs will make one thing clear: *BGP is robust enough to
> withstand any extreme congestion.*

Anyone interested in this might also like to look at our report titled "Internet Routing Behavior on 9/11 and in the Following Weeks":

http://www.renesys.com/projects/reports/renesys-030502-NRC-911.pdf

Note in particular the minute-by-minute changes in routing table size around critical events on pages 9 through 11. Fine time granularity is important to avoid missing all the interesting features.

--jim