Redundant / High available BIRD

2013-10-02 Thread 1 Игорь

A common scheme used by ISPs is:
>
>   ISP 1 --- BIRD 1 --- Switch
>               |    \ /
>               |     X
>               |    / \
>   ISP 2 --- BIRD 2 --- Switch
>
> This actually requires two running BIRDs but leaves me with the question
> how to deal with the IP address on the internal side. So in theory I would
> have two virtual default gateways for connected internal equipment?!

If you have two uplinks, I guess you have a large network that runs an IGP,
and then things are simple: you just need to redistribute the 0/0 route
and/or more specific routes from the border routers into the IGP
(redistributing a full view into the IGP is a bad idea). Between the border
routers you need to set up an iBGP session. Then a failure at any single
point is handled.
If one border router fails, the IGP stops receiving routes from it and all
traffic goes through the other, working border router.
If one of the ISP links fails, the border router with the failed link sends
traffic to the other border router, whose link still works.
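
The two failure cases above can be traced with a small toy model. This is
only a sketch in Python, not BIRD configuration: the names Border and
exit_path are made up for illustration, and it assumes that a border router
keeps announcing the default route into the IGP while its own uplink is down,
so that traffic detours over the iBGP peer.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Border:
    """Hypothetical stand-in for one border router."""
    name: str
    alive: bool = True       # router is up and announces 0/0 into the IGP
    uplink_up: bool = True   # its ISP link / eBGP session is up


def exit_path(preferred: Border, other: Border) -> List[str]:
    """Return the hops from the internal network to an ISP, or [] if none."""
    # A dead border router stops announcing the default route, so the IGP
    # only offers the surviving one.
    candidates = [b for b in (preferred, other) if b.alive]
    if not candidates:
        return []
    first = candidates[0]
    if first.uplink_up:
        return [first.name, "ISP"]
    # Own uplink is down: routes learned over iBGP point at the peer.
    peer = other if first is preferred else preferred
    if peer.alive and peer.uplink_up:
        return [first.name, peer.name, "ISP"]
    return []


if __name__ == "__main__":
    br2 = Border("BR2")
    print(exit_path(Border("BR1"), br2))                   # both healthy
    print(exit_path(Border("BR1", alive=False), br2))      # BR1 is down
    print(exit_path(Border("BR1", uplink_up=False), br2))  # ISP 1 link down
```

The model ignores convergence time and metrics; it only shows which router
ends up carrying the traffic in each case.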

If you have two uplinks with BGP and, for some reason, no IGP, and customers
connect directly to the border routers (BR), it is a good idea to run VRRP
between the BRs on the local network. This gives you an active/backup
gateway: one BR holds a virtual IP that you assign as the default gateway,
and if that BR fails, the virtual IP moves transparently to the other BR.
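
The active/backup behaviour can be pictured as a simple election. The
following is a rough Python sketch, not a VRRP implementation: VrrpRouter
and elect_master are hypothetical names, the priorities are example values,
and ties (which real VRRP breaks by interface address) are not modelled. It
also assumes preemption is enabled, so the highest-priority live router
always holds the virtual IP.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VrrpRouter:
    """Hypothetical stand-in for one border router taking part in VRRP."""
    name: str
    priority: int       # higher value wins the election
    alive: bool = True


def elect_master(routers: List[VrrpRouter]) -> Optional[str]:
    """Return the name of the router that holds the virtual IP, if any."""
    live = [r for r in routers if r.alive]
    if not live:
        return None
    return max(live, key=lambda r: r.priority).name


if __name__ == "__main__":
    brs = [VrrpRouter("BR1", priority=150), VrrpRouter("BR2", priority=100)]
    print(elect_master(brs))   # BR1 holds the virtual IP
    brs[0].alive = False       # BR1 fails
    print(elect_master(brs))   # virtual IP moves to BR2
```

Internal hosts keep the same default gateway address throughout; only the
router answering for it changes.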

> Maybe I'm also totally on the wrong road. The basic plan is two different
> ISP connections, two Linux systems running BIRD and Corosync with Pacemaker
> to achieve high availability - and later some peering partners. I would like
> to see a fast automated failover in case a link or hardware breaks down.

Corosync and Pacemaker are the wrong road. They are meant for end-point
applications like web servers.
For networks, if you want HA, you just need more nodes and proper settings.


Redundant / High available BIRD

2013-10-01 Thread Robert Scheck
Hello list,

even though I have spent some time on BGP and BIRD, I am still something of
a newbie, with the goal of setting up redundant/highly available BGP using
BIRD in the future. Initially I thought about putting BIRD onto a Linux
system with Corosync and Pacemaker for high availability. The two machines
would also be connected directly for heartbeat communication. This would be
an active/standby setup - but I'm not sure if this is a good idea (physical
layout):

  ISP1 ---+  +--- BIRD 1 --- Switch
          |  |      |    \ /
         Switch     |     X
          |  |      |    / \
  ISP2 ---+  +--- BIRD 2 --- Switch

Note that the "X" is only a crossing of cables, not a connection. This
obviously introduces a single point of failure: the first switch. My next
idea was doubling the switch, which of course requires two physical cables
per ISP:

  ISP 1 --- Switch 1 --- BIRD 1 --- Switch
        \ /          \ /   |    \ /
         X            X    |     X
        / \          / \   |    / \
  ISP 2 --- Switch 2 --- BIRD 2 --- Switch

This seems to be...expensive. Even this idea leaves me with an active/standby
setup for BIRD. So Pacemaker would start BIRD on one of the two servers and
stop it on the other. During maintenance of the servers, active/standby
would be switched to standby/active - but that resets all BGP sessions, which
is likely to be especially disliked by any later peering partners. Correct
me if I am wrong here, please. I also thought about this setup:

  ISP 1 --- BIRD 1 --- Switch
              |    \ /
              |     X
              |    / \
  ISP 2 --- BIRD 2 --- Switch

This actually requires two running BIRDs but leaves me with the question
how to deal with the IP address on the internal side. So in theory I would
have two virtual default gateways for connected internal equipment?!

Maybe I'm also totally on the wrong road. The basic plan is two different
ISP connections, two Linux systems running BIRD and Corosync with Pacemaker
to achieve high availability - and later some peering partners. I would like
to see a fast automated failover in case a link or hardware breaks down.

Searching the Internet turns up lots of BSD-based setups with CARP/pfsync
but only a few Linux-based ones; somebody mentioned keepalived. So are
Corosync and Pacemaker just the wrong approach?

Are there any recommendations or best practices for the physical and software
structure? Do you run similar setups, and what do they look like? What are
you using on the software level - besides BIRD? And how long does a failover
take for you? Are you running active/active or active/standby?

I am also looking for hardware recommendations: If I am not completely
mistaken, the above requires (depending on the configuration) up to two full
BGP tables in memory...how much is that for IPv4 and IPv6? How about the CPU?
I can read in various documents that BIRD requires less CPU than others, but
what does less mean as a rough absolute value in GiB for the above situation?

Thank you for reading my long e-mail and sorry for taking your time! Also
feel free to just send me some pointers in case I missed some documentation
or other useful links.


Greetings,
  Robert