Hi, so I'm now following the design that everybody claims is "best" (loopbacks in OSPF, everything else in BGP), and I've found a few corner cases that are seriously worse than "customer routes in OSPF".
Number one - consider the following (simplified) network:

    Upstream 1 <---> ISP-Router 1 <---> ISP-Router 2 <---> Upstream 2
                                             |
                                         Customer X

Both ISP-Routers announce the ISP's aggregate (let's call it 200.1.0.0/16) to their respective upstream providers (static route to null0, "network" statement). This needs to be done to make sure that the aggregate is always visible, even if one of the routers is down.

Customer X uses addresses from 200.1.0.0/16, let's give him 200.1.1.1/32.

So, when "ISP-Router 1" boots, the following happens, more or less in this order:

1. bootup complete
2. OSPF neighbor establishes with ISP-Router 2
3. eBGP-Session to "Upstream 1" establishes, 200.1.0.0/16 is announced (only a single prefix is announced outbound)
4. iBGP-Session to "ISP-Router 2" establishes, 200k prefixes start propagating ISP-R2 -> ISP-R1 (full table at ISP-R2)
5. Traffic starts flowing from "Upstream 1" to "ISP-Router 1" (because the Upstream router is installing the 200.1.0.0/16 route right away)
6. <20-60 seconds delay>
7. ISP-R1 has processed all the BGP prefixes from ISP-R2, has built a FIB, and programmed everything in its hardware forwarding engines.
8. Traffic from "Upstream 1" to "Customer X" can be forwarded properly

The crucial element here is: between the items "5" and "8", packets coming from "Upstream 1" to "Customer X" are *dropped*, because ISP-R1 has no full internal reachability information yet, but is still announcing reachability for the aggregate to "Upstream 1".

The 20-60 seconds delay comes from the fact that even if the eBGP and iBGP sessions are established at roughly the same time, the eBGP session only has to announce one single prefix ("instantaneous"), while the iBGP session will see ~200k prefixes, "Customer X" being just one of them, fairly far down at the end (200.1.1.1/32).

So - now I'm wondering if it's only me? Shouldn't this problem bite other folks as well?
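To make the window between steps 5 and 8 concrete, here is a small Python sketch of a longest-prefix-match lookup on ISP-R1. It is only an illustration and not how IOS builds its FIB: the dictionary-based table, the `lookup` helper and the next-hop labels are made up for the example. It also assumes the drop happens because the only matching route during the window is the aggregate's static route to Null0.

```python
import ipaddress


def lookup(fib, dst):
    """Return the next hop of the longest matching prefix, or None if nothing matches.

    `fib` is a hypothetical table mapping ip_network objects to next-hop labels.
    """
    addr = ipaddress.ip_address(dst)
    matches = [(net, nh) for net, nh in fib.items() if addr in net]
    if not matches:
        return None
    # Longest prefix wins, as in ordinary IP forwarding.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Steps 3-6: ISP-R1 only has the aggregate (static route to Null0),
# but "Upstream 1" is already sending traffic for the whole /16.
fib = {ipaddress.ip_network("200.1.0.0/16"): "Null0"}
before = lookup(fib, "200.1.1.1")

# Step 7: the customer /32 learned via iBGP from ISP-R2 is finally installed.
fib[ipaddress.ip_network("200.1.1.1/32")] = "ISP-Router 2"
after = lookup(fib, "200.1.1.1")

print("during the window:", before)
print("after convergence:", after)
```

During the window the lookup for 200.1.1.1 falls through to the aggregate and ends up at Null0; once the /32 is in the table, the more specific route takes over and the packet is sent towards ISP-Router 2.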
The "other" design (customer routes in IGP) doesn't suffer from it, as IGP is usually done converging before BGP starts. But we don't want that.

One possible solution would be to have a knob that tells IOS "delay bringing up eBGP sessions and/or announcement of routes on eBGP sessions for <n> seconds after initial BGP startup". This would make sure that iBGP has converged before eBGP starts, and no transient black-holing is seen.

Is that possible? I have googled and stared at the command-line help for a while, but couldn't find anything useful.

Routers in question are 6500s with SXI2a.

gert
--
USENET is *not* the non-clickable part of WWW!
//www.muc.de/~gert/
Gert Doering - Munich, Germany
g...@greenie.muc.de
fax: +49-89-35655025
g...@net.informatik.tu-muenchen.de
_______________________________________________ cisco-nsp mailing list cisco-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/cisco-nsp archive at http://puck.nether.net/pipermail/cisco-nsp/