Hi Everyone,

I’ve been looking at improving our BGP configuration lately, and I would just 
like to see if I’m missing anything obvious in terms of speeding up BGP 
convergence (particularly inbound convergence) with our transit providers 
during failover.  I understand that BGP convergence on the internet is not 
going to be perfect, but I am trying to ensure I tune things as best I can. For 
reference, we are using Cisco ASR1001-Xs, though I’m more interested in 
general best practices that other ISPs use, which I can then adapt to our 
network.

I’m also curious if my expectations of trying to minimise convergence times 
with transit peers are realistic or not.

Right now, I am seeing 20-30 second outage windows when failing over my 
announced prefixes from one transit provider to another. I can understand this 
when transitioning between transits, but I see this even when failing over 
between a primary/secondary peering session with a single AS / transit 
provider, which is disappointing. My hope was to have almost no interruption 
where we have multiple links with a given transit provider, and a small 
convergence window (maybe 2-5s?) when transitioning prefixes from one transit 
to another.

With iBGP it seems there are lots of options, and sub-second convergence looks 
fairly easy to achieve. eBGP is where the options become more limited and the 
situation is harder to improve.

For iBGP I can do:


  *   BFD
  *   BGP Multipath – I haven’t tested, but I assume having multiple paths in 
the FIB will speed up failover convergence.
  *   BGP Best External
  *   Add Path

For eBGP I can do:


  *   BFD (If supported by the upstream – I have this on all peers)
  *   Advertisement Interval – I have set this to 0 (doesn’t make a major 
difference, but helps a little)
  *   BGP Multipath (if supported by the upstream – unfortunately my upstream 
requires the primary/secondary paths are enforced on their side via localpref 
so I can’t leverage this).
  *   AS Path prepending of the same prefixes instead of announcing less/more 
specific prefixes at different sites seems to help.
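
As a rough illustration of why BFD matters for the detection part of failover, 
the sketch below compares failure detection via BFD with detection via the BGP 
hold timer alone. The 300 ms interval and multiplier of 3 are assumed example 
values, not settings from this network; 180 s is the usual Cisco default hold 
time. The function names are hypothetical. This only covers detection on the 
local session; how quickly the upstream propagates the change is a separate 
cost that is not modelled here.

```python
# Minimal sketch: compare failure-detection bounds for BFD and the BGP hold timer.
# All timer values below are assumptions chosen for illustration.


def bfd_detection_time_ms(interval_ms: int, multiplier: int) -> int:
    """Approximate BFD detection time: agreed interval times detect multiplier."""
    return interval_ms * multiplier


def hold_timer_detection_time_ms(hold_time_s: int) -> int:
    """Upper bound on detection when relying only on the BGP hold timer."""
    return hold_time_s * 1000


if __name__ == "__main__":
    bfd_ms = bfd_detection_time_ms(interval_ms=300, multiplier=3)  # assumed values
    hold_ms = hold_timer_detection_time_ms(hold_time_s=180)  # usual Cisco default
    print(f"BFD detection:        {bfd_ms} ms")
    print(f"Hold timer detection: up to {hold_ms} ms")
```

With BFD in place, detection is well under a second in this example, so most 
of a 20-30 second window would have to come from update propagation and 
best-path selection rather than from noticing the failure.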

I haven’t found any other commonly accepted methods of announcing a backup path 
to eBGP peers.

We are using Equinix Connect transit in Sydney as our main transit, where we 
have primary and secondary links between us and Equinix. And Vocus as our main 
transit in Melbourne, with the intention of failing over all our announced 
prefixes between sites as required, by leveraging AS Path prepending.
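
To make the prepending approach concrete, here is a minimal sketch of how a 
remote network might choose between the two announcements. It models only two 
steps of BGP best-path selection (higher local preference first, then shorter 
AS path) and ignores every later tie-breaker. The `Route` class, the 
`best_path` function and the AS numbers (taken from the documentation range) 
are hypothetical and do not reflect our real ASNs or those of our transits.

```python
# Minimal sketch of two BGP best-path steps: local-pref, then AS path length.
# Names and AS numbers are hypothetical; later tie-breakers are not modelled.
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Route:
    via: str                  # label for the site the path enters through
    local_pref: int           # set by the receiving network, not by us
    as_path: Tuple[int, ...]  # AS path as seen by the receiving network


def best_path(routes: List[Route]) -> Optional[Route]:
    """Prefer the highest local-pref, then the shortest AS path."""
    if not routes:
        return None
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


if __name__ == "__main__":
    our_as = 64496
    sydney = Route("sydney", 100, (64497, our_as))
    # Same prefix announced via Melbourne with three extra prepends.
    melbourne = Route("melbourne", 100, (64498, our_as, our_as, our_as, our_as))

    print(best_path([sydney, melbourne]).via)  # shorter AS path is preferred
    print(best_path([melbourne]).via)          # only path left after a withdrawal

    # A higher local-pref on the receiving side overrides our prepending.
    melbourne_preferred = Route("melbourne", 200, melbourne.as_path)
    print(best_path([sydney, melbourne_preferred]).via)
```

The last case is the reason prepending is only a hint: local preference is 
evaluated before AS path length, so any network that sets its own local-pref 
on these routes will ignore the prepends.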

Are there any other techniques or best practices I am missing that would help 
reduce downtime in the event of a router or BGP session failure?

Appreciate any insights you can offer, and hope this proves to be a useful and 
interesting discussion for others.

Thanks!

Rhys Hanrahan
Chief Information Officer
Nexus One Pty Ltd

E: supp...@nexusone.com.au
P: +61 2 9191 0606
W: http://www.nexusone.com.au/
M: PO Box 127, Royal Exchange NSW 1225
A: Level 10 307 Pitt St, Sydney NSW 2000

_______________________________________________
AusNOG mailing list
AusNOG@lists.ausnog.net
http://lists.ausnog.net/mailman/listinfo/ausnog