Hi! I'm trying to use HAProxy to support the concepts of "offline", "in maintenance mode", and "not working" servers. I have a separate health check for each condition, and I have been trying to use ACLs to switch between backends. This doesn't seem to work, and I'm also not loving having to repeat the server lists (which are the same) for each backend.

But perhaps I'm misunderstanding something fundamental about how I should be tackling this. As far as I can tell, having multiple httpchk's per backend doesn't work as "if any of these fail, mark this server offline"; I think it's more like "if any of these succeed, mark this server online", and that's what makes this scenario complex. That is, /check can pass while I have marked the server offline manually (so /offline exists), or while I'm in the process of deploying (so /maintenance.html exists). It's not a strictly boolean (online/offline) issue.

Here's the setup:
    global
        maxconn 1024
        log 127.0.0.1 local0 notice
        spread-checks 5
        daemon
        user haproxy

    defaults
        log global
        mode http
        balance leastconn
        maxconn 500
        option httplog
        option abortonclose
        option httpclose
        option forwardfor
        retries 3
        option redispatch
        timeout client 1m
        timeout connect 30s
        timeout server 1m
        stats enable
        stats uri /haproxy?stats
        stats auth hauser:hapasswd
        monitor-uri /haproxy?monitor
        timeout check 10000

    frontend staging 0.0.0.0:8080
        # if the number of servers *not marked offline* is *less than the total number of app servers* (in this case, 2), then it is considered degraded
        acl degraded nbsrv(only_online) lt 2

        # if the number of servers *not marked offline* is *less than one*, the site is considered down
        acl down nbsrv(only_online) lt 1

        # if the number of servers without the maintenance page is *less than the total number of app servers* (in this case, 2), then it is considered maintenance mode
        acl mx_mode nbsrv(maintenance) lt 2

        # if the number of servers without the maintenance page is less than 1, we're down because everything is in maintenance mode
        acl down_mx nbsrv(maintenance) lt 1

        # if not running at full potential, use the backend that identified the degraded state
        use_backend only_online if degraded
        use_backend maintenance if mx_mode

        # if we are down for any reason, use the backend that identified that fact
        use_backend backup_only if down
        use_backend backup_only if down_mx

        # by default, use 'normal ops'
        default_backend normal

    backend only_online
        # if /offline exists, the server has been intentionally marked as offline
        option httpchk HEAD /offline HTTP/1.0
        http-check expect status 404
        http-check send-state
        server App1 app1:8080 check inter 5000 rise 2 fall 2
        server App2 app2:8080 check inter 5000 rise 2 fall 2

    backend maintenance
        # if /maintenance.html exists, the server is in maintenance mode
        option httpchk HEAD /maintenance.html HTTP/1.0
        http-check expect status 404
        http-check send-state
        server App1 app1:8080 check inter 2000 rise 2 fall 2
        server App2 app2:8080 check inter 2000 rise 2 fall 2

    backend normal
        cookie SESSIONID insert indirect
        option httpchk HEAD /check HTTP/1.0
        http-check send-state
        server App1 app1:8080 cookie A check inter 10000 rise 2 fall 2
        server App2 app2:8080 cookie B check inter 10000 rise 2 fall 2
        server Backup1 app3:8080 cookie C check inter 10000 rise 2 fall 2 backup

    backend backup_only
        option httpchk HEAD /check HTTP/1.0
        http-check send-state
        server Backup1 app3:8080 check inter 2000 rise 2 fall 2
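
To make the intent concrete, here is a rough sketch (in Python, purely illustrative) of the per-server logic I'm trying to get out of the three checks. The names `ServerState` and `classify` are hypothetical, not anything HAProxy provides, and the precedence (offline, then maintenance, then not working) is my own assumption about how the conditions should rank. I'm also assuming /check counts as passing on any 2xx or 3xx status, as a plain httpchk does.

```python
from enum import Enum


class ServerState(Enum):
    OFFLINE = "offline"              # /offline exists: intentionally taken out
    MAINTENANCE = "maintenance"      # /maintenance.html exists: deploy in progress
    NOT_WORKING = "not working"      # /check failed
    ONLINE = "online"                # all three checks are happy


def classify(offline_status: int, maintenance_status: int, check_status: int) -> ServerState:
    """Combine the three health-check HTTP statuses into one server state.

    Every check must pass for the server to be ONLINE; a single failing
    check is enough to take it out of rotation.
    """
    # A 404 on /offline means the marker file is absent, i.e. not marked offline.
    if offline_status != 404:
        return ServerState.OFFLINE
    # A 404 on /maintenance.html means no maintenance page is in place.
    if maintenance_status != 404:
        return ServerState.MAINTENANCE
    # Assumption: /check passes on any 2xx or 3xx response.
    if not 200 <= check_status < 400:
        return ServerState.NOT_WORKING
    return ServerState.ONLINE


if __name__ == "__main__":
    # /check passes, but the maintenance page exists: must not count as online.
    print(classify(404, 200, 200))
    # Nothing flagged and /check passes.
    print(classify(404, 404, 200))
```

In other words, a server should only receive normal traffic when `classify` would return `ONLINE`, which is the "all checks must pass" behaviour I can't seem to express with a single backend.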