Daniel Lieberman wrote on 2014-12-05 19:29:
> Why is this wrong?
> 
> We do use health-checking, but we can generate a lot of 503s in 2s.
> Of course the best answer is to fix the app server code so we don’t
> see this, but the developers have been working on this for a while
> and we’re trying to mitigate in the meantime.
> 

Handling 503s gracefully is IMHO a good idea. I have asked before whether
it would be worth getting haproxy to support what we use varnish for
today (haproxy sits in front of the varnish servers, which in turn have
the backends where we need this functionality): if varnish receives a
503 from a backend server, it simply retries the request on the next
backend instead of serving a 503 to visitors. Only when it has tried
all backends does it serve a 503.
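
To make the behaviour concrete, here is a rough sketch of the retry logic
in Python. It is not our actual varnish configuration; the function and
backend names are made up for illustration, and each backend is modelled
as a plain callable returning a (status, body) pair.

```python
from typing import Callable, Sequence, Tuple

Response = Tuple[int, str]  # (HTTP status code, body)


def fetch_with_failover(backends: Sequence[Callable[[], Response]]) -> Response:
    """Try each backend in order and return the first non-503 response.

    A 503 is returned to the client only if every backend answered 503
    (or if no backends are configured at all).
    """
    last: Response = (503, "no backends configured")
    for backend in backends:
        last = backend()
        if last[0] != 503:
            return last  # first backend that did not answer 503 wins
    return last  # all backends were tried and all answered 503


# Hypothetical backends: one still starting up, one healthy.
def starting_up() -> Response:
    return (503, "Service Unavailable")


def healthy() -> Response:
    return (200, "ok")


print(fetch_with_failover([starting_up, healthy]))  # (200, 'ok')
```

The visitor only sees a 503 when the list is exhausted, which is the part
we would like haproxy to be able to do as well.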

It means we can release Java applications with startup times of minutes,
and handle "odd cases" where a backend suddenly returns a 503 (we use
apache in front of the backends, so it answers 503 when the backend does
not respond), without having to actively remove the nodes first.
Requests that fail with a 503 simply get retried until the health check
notices that the node is down and stops sending requests to it.

Without the retry, the 503s are quite easily noticed: since we peak at
1500 req/s on a daily basis, a lot of requests can reach the faulty
backend before a health check pulls it out.
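
As a back-of-the-envelope illustration of "a lot": the 1500 req/s is our
peak, the 2s window is the figure from Daniel's mail, and the backend
count of 4 with perfectly even load distribution is purely an assumption
for the sake of the example.

```python
def requests_hitting_faulty_backend(rate_per_s: float,
                                    detection_window_s: float,
                                    n_backends: int) -> float:
    """Estimate how many requests land on one faulty backend before the
    health check removes it, assuming load is spread evenly."""
    return rate_per_s * detection_window_s / n_backends


# 1500 req/s peak, 2 s until the check catches it, 4 backends (assumed).
print(requests_hitting_faulty_backend(1500, 2, 4))  # 750.0
```

With fewer backends or a slower check the number only goes up.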

-- 
Regards,
Klavs Klavsen, GSEC - k...@vsen.dk - http://www.vsen.dk - Tlf. 61281200

"Those who do not understand Unix are condemned to reinvent it, poorly."
  --Henry Spencer

