We have been using HAProxy in a production environment, without issue
for a long period. Thanks for a wonderful product!

Unfortunately we recently encountered some issues as we have worked on
the migration of one of our sites onto a new HAProxy based load
balancing solution. We've started to notice issues related to persistent
cookies, client requests and down backend servers.

This new application requires users remain on the same web server to
avoid losing session information, which is not shared between backend
servers. If we stop one of the backend servers (port 80 is no longer
listening, HAProxy receives a RST packet from the server when sending a
health check or client request) the clients who have a persistent
session on this web server will continue to be sent to the until it is
formally declared down (after two health checks fail, as controlled by
the fall 2 parameter).

Here's where we run into our issue:

While HAProxy is receiving RST responses, it sends a 503 response to the
client. We're not very eager to send this error response to the client.
It appeared, from my reading of the documentation, that by setting
"option redispatch" and "retries 3" (or greater than 1) we should get
HAProxy to retry, and, in the event of explicit connection failure from
the backend server, move on to the next functioning server on the final
retry. This doesn't appear to be the case. 

To make matters worse, when HAProxy throws a 503 response because of a
RST it ignores the errorfile directive. If you have two servers, and
stop one you will receive an extremely plain 503 error response. If no
backend is available at all the errorfile directive functions properly,
and the "pretty" error message is returned.

Ideally, in the event that a backend server is returning RSTs, we'd like
to move to the next server. HAProxy could either do this immediately or
buffer the request until it makes the final determination that the
backend server is down and can send it to another server. If that isn't
possible, we'd really like the 503 response returned to the client to be
the one specified in the errorfile. 

I'm going to investigate the possibility of creating a patch for this
issue tonight, though if more experience hands could help, either with
the patch, or with something obvious that I've missed in my
configuration, I'd greatly appreciate it. 

A few other notes:

Interestingly, even upon receiving a RST to a client request to the
backend server, HAProxy doesn't consider the server as having failed a
health check until it performs it's next health check. So, if you have a
health checking interval of 10 seconds, if a customer makes a request 1
second after the first health check, with fall 2 set, it will take 29
seconds before the backend server is declared down and the client moved
on to the next server.

The documentation for the track command could also be made a bit
clearer, it took me a while (and a colleagues examination of the source)
to determine that the <proxy> is another backend, in the case that you
are trying to reference a server from a different backend. (Perhaps,
depending on your configuration, it's not always a backend, but could be
something else?).

Thank you for taking the time to read this novella :), configuration
follows, thanks in advance for your help,

-JohnF

Configuration Details

We are running 1.3.15.7, with the following configuration (excerpt):

global
  stats socket /var/run/haproxy.stat

defaults
  balance roundrobin
  cookie SERVERID insert indirect
  option httpchk GET /index.html HTTP/1.0
  timeout client 10m
  timeout server 10m
  timeout connect 3s

frontend http_frontend *:80
  mode http
  reqirep ^Host:([^:]*) Host:\1
  #Traffic matching ACLs
[...]
  acl host_qa_site_com hdr(host) -i qa.site.com
  use_backend qa_site_com_http if host_qa_site_com
frontend ssl_frontend *:81
  mode http
  reqirep ^Host:([^:]*) Host:\1
  #Traffic matching ACLs
[...]
  acl host_qa_site_com hdr(host) -i qa.site.com
  use_backend qa2_site_com_ssl if host_qa_site_com
[...]

backend qa_site_com_http
  mode http
  errorfile 503 /etc/haproxy/errorfiles/503.http
  option redispatch
  retries 3
  option httpchk GET /ut.asp
   server web_1 web1:80 cookie 3f06565277298ce80af6bbaab8c5b584 check
inter 5100ms fall 2
   server web_2 web2:80 cookie bc62407de9ecf95bae662880b593a0d4 check
inter 5100ms fall 2
backend qa_site_com_ssl
  mode http
  errorfile 503 /etc/haproxy/errorfiles/503.http
  option redispatch
  retries 3
  option httpchk GET /ut.asp
   server web_1 web1:80 cookie 3f06565277298ce80af6bbaab8c5b584 track
qa_site_com_http/web_1
   server web_2 web2:80 cookie bc62407de9ecf95bae662880b593a0d4 track
qa_site_com_http/web_2

Reply via email to