That's why you make sure that any incident where max-prefix is tripped is caught by a syslog watcher and brought to the immediate attention of whoever's sitting in your NOC. Honestly, if all you're dealing with is customer BGP sessions, I would propose that 90% of them don't advertise more than 10 prefixes, so a max-prefix limit of, say, 100 should do for most cases. And for that last 10%, max-prefix is a per-session configuration, so that number can always be set higher. IMO, advertising 100 routes for 30 seconds is far less damaging than advertising 8000.
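For what it's worth, the syslog watcher can be very small. Here is a rough Python sketch of the idea, not a finished tool. The two message patterns are assumptions modelled on the Cisco IOS %BGP-4-MAXPFX (warning threshold) and %BGP-3-MAXPFXEXCEED (limit exceeded) messages, and the names classify, watch, and notify are made up for illustration, so check the patterns against what your own routers log.

```python
import re

# Assumed message shapes, modelled on Cisco IOS %BGP-4-MAXPFX (warning
# threshold reached) and %BGP-3-MAXPFXEXCEED (limit exceeded). Verify them
# against your own router logs before relying on this.
WARN_RE = re.compile(
    r"%BGP-4-MAXPFX: .*? from (?P<peer>\S+) .*?"
    r"reaches (?P<count>\d+), max (?P<limit>\d+)"
)
EXCEED_RE = re.compile(
    r"%BGP-3-MAXPFXEXCEED: .*? from (?P<peer>\S+) .*?"
    r": (?P<count>\d+) exceed limit (?P<limit>\d+)"
)


def classify(line):
    """Return (severity, peer, count, limit) for a max-prefix event, else None."""
    # "page" means the limit was exceeded; "warn" means the threshold was hit.
    for severity, pattern in (("page", EXCEED_RE), ("warn", WARN_RE)):
        m = pattern.search(line)
        if m:
            return (severity, m.group("peer"),
                    int(m.group("count")), int(m.group("limit")))
    return None


def watch(lines, notify=print):
    """Scan an iterable of syslog lines and call notify() once per event."""
    events = []
    for line in lines:
        event = classify(line)
        if event:
            events.append(event)
            notify("BGP max-prefix %s: peer %s at %d of %d" % event)
    return events
```

Feeding it sys.stdin from a tail of the syslog file, with notify replaced by whatever pages your NOC, would be one way to wire it up. The "warn" events are the ones that give you time to raise a limit before a session actually trips.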
Also, don't forget about the warn option: if a customer's organic growth puts them close to the prefix limit, you should get a heads-up in most cases.

I recall an incident where we brought up a customer advertising around 600 routes and sent the prefix list to our upstream, who dutifully added all 600 routes to the prefix list but neglected to raise their maximum-prefix limit from 300. This, of course, had predictable results. Doh.

-C

> This isn't a terribly cisco-specific reply so I'll keep it here.
>
> The problem with restart systems (btw thank you cisco for finally adding
> this) is, think about how much damage can be done by announcing 8k routes
> for the 30 seconds (or 5-10 minutes if there is a Foundry in the mix :P)
> before you get to the limit and kill the session. Now add in the damage
> caused by this happening every 15 minutes, and the dampening. Or even
> worse, someone who turns up more routes and happens to hit right around
> the exact number or close to it. Imagine a session which goes over by 1
> route, trips, stays down for 15 minutes, comes back up and this time has 1
> less route, and no one notices the prefix limit needs to be raised. You
> should make sure that the restart time exceeds the number/length of flaps
> necessary to trigger dampening, which on a connect you transit is pretty
> darn hard to accurately guess.
>
> IMHO, using only prefix limits on a customer is actually doing them (and
> the rest of the internet that listens to your announcements) a disservice.
>
> A better system might be where the session is kept up (or periodically
> polled, if you want to make it obvious to the other party that there is a
> problem) without installing the routes, and kept in a "quarantine" state
> for X amount of time to make sure that things stay below a configured
> number. This would be at least a slightly better way of recovering quickly
> once the "problem" has passed, without mucking things up every 15 minutes
> in the process.
>
> --
> Richard A Steenbergen <[EMAIL PROTECTED]>       http://www.e-gerbil.net/ras
> PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA B2 46 B3 D8 14 36 FE B6)