Hi all,

I've noticed an odd (lack of) interaction between "maxconn" and "option 
httpchk"... 

If a server's maxconn limit has been reached, it appears that HTTP health
checks are still dispatched. If I've configured maxconn to match the number of
requests the backend server can handle concurrently, and all of these
connections are busy with slow requests, the health check fails and HAProxy
marks the server as down. Once the server completes a request, HAProxy still
waits until "rise" consecutive health checks have succeeded before using it
again (as expected if the server had really been down, but it was only busy).
This makes overly busy times even worse.

I'm not sure if this explanation is clear; perhaps a concrete configuration 
might help.

listen load_balancer
        bind :80
        mode http

        balance leastconn
        option httpchk HEAD /healthchk
        http-check disable-on-404

        default-server port 8080 inter 2s rise 2 fall 1 maxconn 3
        server srv1 srv1.example.com:8080 check
        server srv2 srv2.example.com:8080 check

With the above toy example, if each of srv1 and srv2 can only respond to 3
requests concurrently, and 6 slow requests come in (each taking more than 2
seconds), both backend servers will be marked down, and each will stay down
for up to 4 seconds (inter 2s * rise 2) after one of its requests finishes.

I know I can work around this by setting maxconn to one less than a server's 
maximum capacity (perhaps this would be a good idea for other reasons). I 
suspect I could work around this by using TCP status checks instead of HTTP 
status checks, though I haven't tried this as I like the flexibility HTTP 
health checks offer (like "disable-on-404").
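
As a rough, untested sketch of the first workaround, and assuming each server
really can handle exactly 3 concurrent requests, the server lines from the
example above would become something like the following. The only change is
maxconn 3 -> 2, which should leave one slot free for the health check:

        # Assumption: each backend handles at most 3 concurrent requests.
        # maxconn 2 keeps one slot in reserve for the HTTP health check.
        default-server port 8080 inter 2s rise 2 fall 1 maxconn 2
        server srv1 srv1.example.com:8080 check
        server srv2 srv2.example.com:8080 check

The obvious cost is that a third of each server's capacity sits idle between
health checks in this toy setup; with a larger real-world maxconn the overhead
would be proportionally smaller.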

Is this behavior a bug or a feature? Intuitively I would have expected the HTTP 
health checks to respect maxconn limits, but perhaps there was a conscious 
decision to not do so (for instance, maybe it was considered unacceptable for a 
server's health to be unknown when it is fully loaded).

Thanks,
Bryan
