[ Long message and proposal follows. Bear with me. There are a lot of words, but that is because we need a lot of help/input! ;-) ]
So, this has come up several times in the past, and we discussed it again this
year at ApacheCon: How do we get the load balancer to make smarter, more
informed decisions about where to send traffic? The different LB methods
provide different attempts at balancing traffic, but ultimately none of them
is "smart" about its decision. Other than a member being in error state, the
balancer makes its decision solely based on configuration (LB set, factor,
etc.) and its own knowledge of the member (e.g. requests, bytes).

What we have often discussed is a way to get some type of
health/load/capacity information from the backend so that we can make informed
balancing decisions. One method is to use health checks (a la haproxy, AWS
ELBs, etc.) that request one or more URLs, where the response code/time
indicates whether or not the service is up and available, allowing more
proactive decisions. While this is better than our current state of reactively
marking members in error state based on failed requests, it still provides a
limited view of the health/state of the backend.

We have also discussed implementing a way for backends to communicate a
magical "load" number to the front end to take into account as it balances
traffic. This would give a much better view into the backend's state, but it
requires some way to come up with that calculation, which each backend
system/server/service/app must provide. It then has to be implemented in all
the various backends (e.g. httpd, tomcat, php-fpm, unicorn, mongrel, etc.),
which is probably a hard sell to all of those projects. And the front end
would have limited control over what that number means or how to use it.

During JimJag's balancer talk at ApacheCon this year, he brought up this issue
of "better, more informed" decision making again. I put some thought into it
that night and came up with some ideas. Jim, Covener, Trawick, Ruggeri, and I
then spent some time over the next couple of days talking it through and
fleshing out some of the details. Based on all of that, below is what I am
proposing. I have some initial code that I am working on to implement the
different pieces of this, and I will put them up in bugz or somewhere when
they're a little less rudimentary.

--

Our hope is to create a general standard that can be used by various projects,
products, proxies, servers, etc., to provide a more consistent way for a load
balancer to request and receive useful internal state information from its
backend nodes. This information can then be used by the *frontend*
software/admin (this is the main change from what we have discussed before) to
calculate a load factor appropriate for each backend node.

This communication uses a new, standard HTTP header, "X-Backend-Info", that
takes this form in RFC2616 BNF:

   backend-info    = "version" "=" ( float | <"> float <"> )
                     [ *LWS "," *LWS #( numeric-entry | string-entry ) ]

   numeric-entry   = numeric-field "=" ( float | <"> float <"> )
                     ; that is, numbers may optionally be enclosed in
                     ; quotation marks

   float           = 1*DIGIT [ "." 1*DIGIT ]

   numeric-field   = "workers-max"       ; maximum number of workers the
                                         ; backend supports
                   | "workers-used"      ; current number of used/busy workers
                   | "workers-allocated" ; current number of allocated/ready
                                         ; workers
                   | "workers-free"      ; current number of workers available
                                         ; for use (generally the difference
                                         ; between workers-max and
                                         ; workers-used, though some
                                         ; implementations may have a
                                         ; different notion)
                   | "uptime"            ; number of seconds the backend has
                                         ; been running
                   | "requests"          ; number of requests the backend has
                                         ; processed
                   | "memory-max"        ; total amount of memory available
                                         ; in bytes
                   | "memory-used"       ; amount of used memory in bytes
                   | "memory-allocated"  ; amount of allocated/committed
                                         ; memory in bytes
                   | "memory-free"       ; amount of memory available for use
                                         ; (generally the difference between
                                         ; memory-max and memory-used, though
                                         ; some implementations may have a
                                         ; different notion)
                   | "load-current"      ; the (subjective) current load for
                                         ; the backend
                   | "load-5"            ; the (subjective) 5-minute load for
                                         ; the backend
                   | "load-15"           ; the (subjective) 15-minute load
                                         ; for the backend

   string-entry    = string-field "=" ( token | quoted-string )

   string-field    = "provider"          ; informational description of the
                                         ; backend information provider
                                         ; (module, container, subsystem,
                                         ; app, etc.)

As used here, "worker" is an overloaded term whose precise meaning is
backend-dependent. It might refer to processes, threads, pipelines, or
whatever the backend system/server/service/app uses to measure or limit its
number of active, processing connections.
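To give a feel for how simple the header is to consume, here is a small,
self-contained C sketch that splits a header value into field/value pairs. It
is only an illustration, not code from any of the modules mentioned below, and
it cheats a bit: it ignores the full LWS rules and would mis-split a quoted
string containing a comma, which a real parser must handle.

   #include <stdio.h>
   #include <string.h>

   /* Split an X-Backend-Info value into field=value pairs and strip the
    * optional surrounding quotation marks from each value. */
   static void parse_backend_info(char *value)
   {
       char *entry, *last;

       for (entry = strtok_r(value, ",", &last); entry != NULL;
            entry = strtok_r(NULL, ",", &last)) {
           char *eq = strchr(entry, '=');
           char *field, *val;
           size_t len;

           if (!eq)
               continue;                          /* malformed entry; skip */

           *eq = '\0';
           field = entry + strspn(entry, " \t");  /* trim leading LWS */
           val = eq + 1;
           len = strlen(val);
           if (len >= 2 && val[0] == '"' && val[len - 1] == '"') {
               val[len - 1] = '\0';               /* unquote */
               val++;
           }
           printf("%s -> %s\n", field, val);
       }
   }

   int main(void)
   {
       char header[] = "version=1.0, provider=\"Backend X\", "
                       "workers-max=1000, workers-used=517";
       parse_backend_info(header);
       return 0;
   }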
The process-flow looks like this:

1. The frontend (periodically based on time or requests, or on demand), as
   part of either (a) a normal proxied request or (b) a dedicated health
   check, adds an "X-Backend-Info" request header to a backend request,
   informing the backend that it wants node state information. I.e.:

      X-Backend-Info: version=1.0

2. The backend node receives a request with an "X-Backend-Info" header
   specifying a version it supports.

3. A supporting backend node SHOULD insert one or more "X-Backend-Info"
   response headers with any subset of the backend-info fields that it
   supports, including the required "version" field. The version of
   information provided MUST be less than or equal to the version requested.
   (The fields are standardized so that various frontends know what to
   expect, rather than each backend system/server/service/app creating its
   own fields/values.) E.g.:

      X-Backend-Info: version=1.0, provider="Backend X", workers-max=1000,
                      workers-used=517, workers-free=483, uptime=19234,
                      requests=85939

4. The backend MUST add the "X-Backend-Info" token to the "Connection"
   response header, making it a hop-by-hop field that is removed by the
   frontend from the downstream response (RFC2616 14.10 and RFC7230 6.1).
   [Note there appears to be an httpd bug here that I intend to submit and
   that needs to be addressed.]

      Connection: X-Backend-Info

5. The frontend parses the backend-info entries in the received
   "X-Backend-Info" response header. The values are then used as part of
   either an internal or an administrator-specified calculation to determine
   the load factor or weight of that node for subsequent requests.

6. The frontend MUST remove the hop-by-hop "X-Backend-Info" response header
   per the RFCs.
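To make the frontend's half of steps 1, 5, and 6 concrete, here is a minimal
sketch in httpd/APR terms. The function names are hypothetical and this is not
the actual lbmethod code described below; it also assumes the proxy has
already merged the backend's response headers into r->headers_out.

   #include "httpd.h"
   #include "apr_strings.h"

   /* Step 1: ask the backend for node state on this proxied request. */
   static void request_backend_info(request_rec *r)
   {
       apr_table_setn(r->headers_in, "X-Backend-Info", "version=1.0");
   }

   /* Steps 5 and 6: keep a copy of the backend's answer for our own load
    * calculation, then strip the hop-by-hop header so it never reaches the
    * client. (Real code would also drop the "X-Backend-Info" token from the
    * Connection header rather than leaving it behind.) */
   static const char *consume_backend_info(request_rec *r)
   {
       const char *info = apr_table_get(r->headers_out, "X-Backend-Info");

       if (info) {
           info = apr_pstrdup(r->pool, info);   /* defensive copy */
           apr_table_unset(r->headers_out, "X-Backend-Info");
       }

       return info;   /* hand this to the backend-info parser */
   }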
--

As for httpd implementation, this has two pieces.

The first is when httpd is used as a backend node behind a load balancer and
must provide X-Backend-Info response data. For this, I have created a module
tentatively named mod_proxy_backend_info that does nothing except insert an
output filter to populate the response header with version, provider,
workers-*, requests, uptime, and load-* values when the request header is
present. Here is an example request-response:

   % curl -v -H 'X-Backend-Info: version=1.0' http://localhost/
   * Trying 127.0.0.1...
   * Connected to localhost (127.0.0.1) port 80 (#0)
   > GET / HTTP/1.1
   > User-Agent: curl/7.41.0
   > Host: localhost
   > Accept: */*
   > X-Backend-Info: version=1.0
   >
   < HTTP/1.1 200 OK
   < Date: Thu, 30 Apr 2015 04:32:08 GMT
   < Server: Apache/2.4.9 (Unix) PHP/5.5.14
   < Last-Modified: Wed, 15 Apr 2015 14:04:54 GMT
   < ETag: "2d-513c3d4d78d80"
   < Accept-Ranges: bytes
   < Content-Length: 45
   < X-Backend-Info: version=1.0, provider="mod_proxy_backend_info [Apache/2.4.9 (Unix) PHP/5.5.14]", workers-max=256, workers-used=1, workers-allocated=4, workers-free=255, uptime=1448, requests=3, load-current=1.737305, load-5=1.733887, load-15=1.668457
   < Connection: X-Backend-Info
   < Content-Type: text/html
   <
   <html><body><h1>It works!</h1></body></html>

The second piece is when httpd is used as the load balancer. For this, I have
created a module tentatively named mod_lbmethod_bybackendinfo that will:

1. Periodically (based on elapsed time, number of requests, or both since the
   last update) insert the X-Backend-Info request header into a proxied
   request.

2. Parse and remove the X-Backend-Info response header.

3. Calculate the member's "informed" load factor based on a formula specified
   by the user/admin in the configuration. I hope to just use the existing
   lbfactor field to store this calculated value. Then we can use existing
   logic to balance based on lbset and lbfactor for subsequent requests.

4. Store the current time and request count in the member's data structure so
   the lbmethod knows when it needs to be updated again.

What I need from all of you:

- Input/commentary on the proposed idea, approach, and implementation.
  Renaming things, additional fields that might be useful, etc., are all up
  for discussion.

- Help with handling the configuration formula mentioned in #3 above. Can we
  just add some math operators to the expression parser to handle this? What
  operations/functions might we need (+ - * /, max, min, ternary
  if-then-else, ...)? A simple-ish example (something like this, maybe?):

     <Proxy "balancer://...">
         BalancerMember ...
         ...
         ProxySet \
             lbmethod=bybackendinfo \
             backendupdateseconds=30 \
             backendupdaterequests=100 \
             backendformula="%{BACKEND:uptime} -lt 120 ? 1 : %{BACKEND:workers-free} / %{BACKEND:workers-max} * 100"
     </Proxy>

- [Near-term] Help adding X-Backend-Info backend support and documentation to
  various projects. Tomcat, php-fpm, and others(?) should be fairly easy to
  implement and submit patches for. This work does us no good if none of our
  backends supports it.

- [Long-term] Help adding X-Backend-Info frontend support and documentation
  to various projects to help this become an "accepted ad-hoc standard"...or
  something like that. Nginx, haproxy, and many others would be targets.

Worn out from writing all of this and hopeful that someone other than me
actually cares, I wish you all well today/tonight!

- Jim