Re: Haproxy failover: DNS RR vs Virtual IP (heartbeat, keepalived)

Willy Tarreau Wed, 05 Jan 2011 23:20:21 -0800

Hi David,

On Thu, Jan 06, 2011 at 02:19:23PM +0900, David wrote:
> Hi,
> 
> Let's say I have an architecture where a couple of servers are put 
> behind a haproxy instance for load balancing, to serve content at 
> www.example.com. For reliability/availability purpose, I want to have 
> more than one haproxy instance to ensure continuous servicing when one 
> haproxy fails:
> 
>          LB1             LB2
>           |               |
>   ------------------------------------
>       |           |            |
>    Server1     Server2      Server3
> 
> The issue is how to "distribute" the load across both load balancers. I 
> am aware of at least two solutions:
> 
>   - DNS Round Robin: www.example.com is resolved to both LB1 and LB2's 
> IP. If e.g. LB1 crashes, clients will then look at the next entry, LB2 
> in this case
>   - High Availability IP (heartbeat, keepalived) between both load 
> balancers. Only one load balancer is proxying all the requests at a time 
> (I assume one load balancer has enough power to serve all our traffic).
> 
> I have been asked to implement the DNS RR method, but from what I have 
> read, method 2 is the one most commonly used. What are the pros/cons of 
> each ?


The first one is pure theory. Test it yourself and you will conclude
that it simply does not work at all. When one LB dies, most clients will
see a connection error or a timeout. A few of them will retry on the
other address, but only after some delay, which makes for an unpleasant
experience. Also, most often the browser does not perform a new lookup
once the first one has worked. That means that until the browser is
closed, the visitor remains bound to the same IP.
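
For DNS RR to provide any availability, every client would have to do
something like the following on each connection: walk the list of
published addresses, paying one timeout per dead entry. This is only a
minimal Python sketch under stated assumptions, not what browsers really
do. The names connect_first_working and resolve_all, the injectable
connector parameter and the 3 second timeout are all invented for the
illustration.

```python
import socket


def connect_first_working(addresses, port, timeout=3.0,
                          connector=socket.create_connection):
    """Try each address in order; return (address, connection) for the
    first one that accepts a connection.

    `connector` is injectable (an assumption of this sketch) so the logic
    can be exercised without a network.
    """
    last_error = None
    for address in addresses:
        try:
            # A dead address can cost up to `timeout` seconds before the
            # next one is tried: this is the delay the visitor feels.
            return address, connector((address, port), timeout)
        except OSError as exc:
            last_error = exc
    raise OSError(f"no address reachable out of {list(addresses)}") from last_error


def resolve_all(hostname, port):
    """Return every IPv4 address published for `hostname` (the RR set)."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET,
                               socket.SOCK_STREAM)
    # Keep resolver order, drop duplicates.
    return list(dict.fromkeys(info[4][0] for info in infos))
```

A client that does not implement such a loop, or that keeps reusing the
first address that once worked, gets no failover from DNS RR at all.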

Then you might think that it's enough to update the DNS entries upon
failure, but that does not propagate quickly, as there are caches
everywhere. To give you an idea, the haproxy ML and site were moved to
a new server one month ago, and we're still receiving a few requests a
minute on the old server. In general you can count on 1-5% of the visitors
caching an entry for more than one week. This is not a problem for disaster
recovery, but it certainly is for a server failover, because it means you
can never take the old address offline.

High availability has the big advantage of always exposing a working
service on the same IP address, so it's a *lot* more reliable and
transparent to users. There are two common ways to provide HA under
Linux:
  - heartbeat
  - keepalived

Heartbeat is more suited to data servers, as it ensures that no more
than one node is up at a time. This is critical when you share file systems.
Keepalived is more suited to stateless servers such as proxies and load
balancers, as it ensures that no less than one node is up at a time. Sadly
people generally confuse them and sometimes use keepalived for NFS servers
or use heartbeat with haproxy...
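
The difference between "no more than one" and "no less than one" shows
up when the nodes can no longer see each other. Here is a toy two-node
model of the two policies in Python. It is only a sketch of the
guarantees described above, not of how heartbeat or keepalived are
implemented; the function names and the a_confirmed_down flag are
assumptions made for the illustration.

```python
def active_never_less_than_one(a_alive, b_alive, link_ok):
    """Backup B takes the service as soon as it stops hearing master A."""
    active = set()
    if a_alive:
        active.add("A")
    b_hears_a = a_alive and link_ok
    if b_alive and not b_hears_a:
        active.add("B")
    return active


def active_never_more_than_one(a_alive, b_alive, a_confirmed_down):
    """Backup B takes the service only once A is positively known down."""
    if a_alive:
        return {"A"}
    if b_alive and a_confirmed_down:
        return {"B"}
    return set()


# Link between the nodes is cut while both are still running.
split = active_never_less_than_one(a_alive=True, b_alive=True, link_ok=False)
# A has died, but B has no confirmation of it yet.
unsure = active_never_more_than_one(a_alive=False, b_alive=True,
                                    a_confirmed_down=False)
```

In this model the first policy can end up with both nodes active, which
a stateless proxy tolerates but a shared file system does not. The
second can end up with nobody active, which protects the data but means
the service is down.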

High availability has one small drawback though: the backup node is
never used, so you don't really know if it works well, and there is a big
temptation not to update it as often as the master node. On the other hand,
an idle backup is also an advantage, in that it allows you to validate your
new configs on it before loading them on the master node. If you want to use
both LBs at the same time, the solution is to have two crossed VIPs on your
LBs (each LB is master for one VIP and backup for the other) and to use DNS
RR to ensure that both VIPs are used. When one LB fails, its VIP moves to
the other one.
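
The crossed VIP layout can be sketched as a small placement table. This
is a minimal Python model, not a keepalived configuration: the function
name place_vips, the CROSSED table and the two addresses (taken from the
192.0.2.0/24 documentation range) are assumptions for the example.

```python
def place_vips(preferences, alive):
    """Map each VIP to the first live LB in its preference list.

    Returns {vip: lb_name}, with None when no LB is left to hold the VIP.
    """
    return {
        vip: next((lb for lb in order if lb in alive), None)
        for vip, order in preferences.items()
    }


# Both addresses are published in DNS for the site (round robin).
CROSSED = {
    "192.0.2.10": ["LB1", "LB2"],  # VIP 1: LB1 is master, LB2 is backup
    "192.0.2.11": ["LB2", "LB1"],  # VIP 2: LB2 is master, LB1 is backup
}

normal = place_vips(CROSSED, alive={"LB1", "LB2"})
lb1_down = place_vips(CROSSED, alive={"LB2"})
```

In normal operation each LB holds one VIP, so DNS RR spreads the load
over both. With LB1 down, LB2 holds both VIPs, so every address announced
in DNS still answers and no DNS change is needed.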

If you stick to the following principles, you should never encounter issues:
  - DNS = load balancing, no availability at all
  - HA = availability, no load balancing at all.
  => use DNS to announce always available IP addresses

Cheers,
Willy
