Hello,

Since I did not get any responses on this, I decided to try motivating a
response by attempting an implementation. I am attaching a patch that does
this. Admittedly this patch is an iteration, and I am not submitting it for
anything more than receiving feedback on the requirement, alternative ideas
and the implementation.

Following is an explanation.

I added an option httpchksrv which takes an ipv4/6 address (external health
checker) and an optional http header. The http header is used to communicate
to the health check server the backend server to check.

    option httpchk GET /_health.php HTTP/1.1
    option httpchksrv <ipv4|ipv6> [header <http-header-name=X-Check-For>]

Next, I added a "header-value" specification to the server definition

    server a1 magic.tumblr.com:80 weight 20 maxconn 5 check inter 2s header-value magic.tumblr.com

The header-value is used for the http-header-name specified in httpchksrv.
Here is an example of the health check request

    GET /_health.php HTTP/1.1
    X-Check-For: magic.tumblr.com

The default value of header-value is the server id, in this case 'a1'.

The following is a little abstract and describes how health checks can be
cached using this change. Please bear with my attempts to describe it; they
may be inadequate. Please take this for what it is, broad strokes of an idea.
I am not in any way advocating for this deployment.

Going back to my original motivation, "excessive health checks due to
increasing proxy and web application deployment", here is a description of
how I can solve it using this implementation.

On haproxy I define 2 frontends, one on port 80 and one on port 6777. The
httpchksrv specification is used to direct health checks back to haproxy on
port 6777. With haproxy in http mode

    option httpchksrv 127.0.0.1:6777

Each server on the backend for port 80 (production traffic) uses a server
specification such as

    server a1 server:80 weight 20 maxconn 5 check inter 2s

I define a backend of varnish nodes to use with the frontend on port 6777.
I also make sure that the varnish backend uses only L4 health checks. Health
checks are passed to varnish from all the proxies, consistently hashed on the
http header X-Check-For, via their frontend on port 6777.

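To make the health check request described above concrete, here is a minimal C sketch of how such a request could be assembled: the request line from `option httpchk`, the header name from `httpchksrv`, and the per-server `header-value`, falling back to the server id. This is not the patch code; the function `build_check_request` and its signature are hypothetical and exist only for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the patch.
 * Writes "<req_line>\r\n<hdr_name>: <value>\r\n\r\n" into buf and returns
 * the request length, or -1 if buf is too small.
 * req_line is the request line without its trailing CRLF. */
int build_check_request(char *buf, size_t size,
                        const char *req_line,
                        const char *hdr_name,
                        const char *hdr_value,
                        const char *server_id)
{
	/* header-value defaults to the server id when it is not configured */
	const char *value = (hdr_value && *hdr_value) ? hdr_value : server_id;
	int len;

	len = snprintf(buf, size, "%s\r\n%s: %s\r\n\r\n", req_line, hdr_name, value);
	if (len < 0 || (size_t)len >= size)
		return -1; /* output error or truncated request */
	return len;
}
```

Called with "GET /_health.php HTTP/1.1", "X-Check-For" and "magic.tumblr.com", this yields the request shown earlier; with no header-value it would send "X-Check-For: a1" for the server named a1.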
Varnish VCL is used to obtain the header value 'X-Check-For' and make a
health check request to the appropriate web host if required; it may return
cached health check responses according to the configured TTL.

Thanks
Bhaskar

On Fri, Jan 31, 2014 at 1:46 PM, Bhaskar Maddala <madda...@gmail.com> wrote:
> Hello,
>
> As the number of haproxy deployments (>20) grows in our infrastructure,
> along with an increase in the number of backends (~1500), we are beginning
> to see non-trivial resources allocated to health checks. Each proxy
> instance health checks each backend every 2 seconds.
>
> In an earlier conversation with Willy I was directed to look into the
> fastinter and on-error configuration options. I have done this but wanted
> to speak about how others might have addressed this, whether there is any
> interest in implementing something along these lines, and to gather
> ideas/comments on what such an implementation would look like.
>
> We use haproxy as a http load balancer and I have not given any thought
> to how the following description applies to tcp mode.
>
> Currently we http check our backends using
>
>     option httpchk GET /_check.php HTTP/1.1\r\nHost:\ www.domain.com
>
> We were considering adding an additional directive to specify a check
> server in addition to the httpchk directive
>
>     option httpchk GET /_health.php HTTP/1.1\r\nHost:\ hdr(Host)
>     option chksrv server hcm-008dad0f 172.16.114.52:80
>
> The change would add a dynamic field to the health check request.
> hdr(Host) (http host header in this instance) is the field used to
> communicate the server to be health checked to the external check server.
>
> The check server can/will be implemented to cache health check responses
> from the back ends.
>
> One of the justifications for implementing this is the need in my
> environment to take into consideration factors not available to the
> backends when responding to a health check. As an example, we will be
> implementing in our check server the ability to force success/failure of
> health checks on groups of backends related in some manner. We expect
> this to allow us to avoid brown out scenarios we have encountered in the
> past.
>
> Has anyone considered/achieved something along these lines, or have
> suggestions on how we could implement the same?
>
> Thanks
> Bhaskar

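The caching idea discussed above (a check server, or Varnish, answering from a cached health check result until a TTL expires) can be sketched in C as follows. This is only an illustration under assumptions: the struct and function names are hypothetical, the table is a small fixed-size array keyed by the X-Check-For value, and the caller passes in the current time.

```c
#include <string.h>
#include <time.h>

#define CACHE_SLOTS 64   /* illustrative table size */
#define HOST_MAX    128  /* longest header value kept, including NUL */

struct check_entry {
	char   host[HOST_MAX]; /* value of the X-Check-For header */
	int    status;         /* last HTTP status seen from the backend */
	time_t expires;        /* entry is stale at or after this time */
	int    used;
};

struct check_cache {
	struct check_entry slots[CACHE_SLOTS];
	int ttl;               /* seconds a result stays fresh */
};

void cache_init(struct check_cache *c, int ttl)
{
	memset(c, 0, sizeof(*c));
	c->ttl = ttl;
}

/* Returns 1 and fills *status on a fresh hit, 0 on a miss or stale entry. */
int cache_lookup(const struct check_cache *c, const char *host,
                 time_t now, int *status)
{
	int i;

	for (i = 0; i < CACHE_SLOTS; i++) {
		const struct check_entry *e = &c->slots[i];

		if (!e->used || strcmp(e->host, host) != 0)
			continue;
		if (now >= e->expires)
			return 0; /* stale: caller should re-check the backend */
		*status = e->status;
		return 1;
	}
	return 0;
}

/* Stores or refreshes a result. Returns 0 on success, -1 if the table is
 * full or the host name does not fit. */
int cache_store(struct check_cache *c, const char *host, int status, time_t now)
{
	struct check_entry *slot = NULL;
	int i;

	if (strlen(host) >= HOST_MAX)
		return -1;

	for (i = 0; i < CACHE_SLOTS; i++) {
		struct check_entry *e = &c->slots[i];

		if (e->used && strcmp(e->host, host) == 0) {
			slot = e; /* refresh the existing entry */
			break;
		}
		if (!e->used && !slot)
			slot = e; /* remember the first free slot */
	}
	if (!slot)
		return -1;

	strcpy(slot->host, host);
	slot->status = status;
	slot->expires = now + c->ttl;
	slot->used = 1;
	return 0;
}
```

With a TTL equal to the 2 second check interval, N proxies checking the same backend would result in roughly one real check per interval instead of N.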
From 914db3e485831e29e5b76bf3d276ce56442b498f Mon Sep 17 00:00:00 2001
From: Bhaskar Maddala <bhas...@tumblr.com>
Date: Wed, 5 Feb 2014 23:58:36 -0500
Subject: [PATCH] Attempt at adding ability to externalize health check

Summary:
We add new option 'httpchksrv' which allows us to specify the server to
use to health check backends. The backend to health check is communicated
via a http header.

The header value to be passed to the backend is specified in the server
specification using the new keyword 'header-value'

The default header is 'X-Check-Host' and the default value is the server id.
---
 include/common/defaults.h |   1 +
 include/types/proxy.h     |   5 ++-
 include/types/server.h    |   3 ++
 src/cfgparse.c            | 103 +++++++++++++++++++++++++++++++++++++++-------
 src/checks.c              |  13 +++++-
 5 files changed, 106 insertions(+), 19 deletions(-)

diff --git a/include/common/defaults.h b/include/common/defaults.h
index f765e90..dc0dd93 100644
--- a/include/common/defaults.h
+++ b/include/common/defaults.h
@@ -131,6 +131,7 @@
 #define DEF_SMTP_CHECK_REQ   "HELO localhost\r\n"
 #define DEF_LDAP_CHECK_REQ   "\x30\x0c\x02\x01\x01\x60\x07\x02\x01\x03\x04\x00\x80\x00"
 #define DEF_REDIS_CHECK_REQ  "*1\r\n$4\r\nPING\r\n"
+#define DEF_CHECK_HOST_HDR   "X-Check-For"
 
 #define DEF_HANA_ONERR		HANA_ONERR_FAILCHK
 #define DEF_HANA_ERRLIMIT	10
diff --git a/include/types/proxy.h b/include/types/proxy.h
index af2a3ab..12f82f5 100644
--- a/include/types/proxy.h
+++ b/include/types/proxy.h
@@ -338,7 +338,10 @@ struct proxy {
 	int grace;				/* grace time after stop request */
 	struct list tcpcheck_rules;		/* tcp-check send / expect rules */
 	char *check_req;			/* HTTP or SSL request to use for PR_O_HTTP_CHK|PR_O_SSL3_CHK */
-	int check_len;				/* Length of the HTTP or SSL3 request */
+	int check_req_len;			/* Length of the HTTP or SSL3 request */
+	struct sockaddr_storage check_addr;	/* the address to check */
+	char *check_hdr_name;			/* HTTP header used to identify host being checked */
+	int check_hdr_name_len;			/* Length of the HTTP header */
 	char *expect_str;			/* http-check expected content : string or text version of the regex */
 	regex_t *expect_regex;			/* http-check expected content */
 	struct chunk errmsg[HTTP_ERR_SIZE];	/* default or customized error messages for known errors */
diff --git a/include/types/server.h b/include/types/server.h
index 54ab813..52b60a5 100644
--- a/include/types/server.h
+++ b/include/types/server.h
@@ -161,6 +161,9 @@ struct server {
 		struct sockaddr_storage addr;	/* the address to check, if different from <addr> */
 	} check_common;
 
+	char *check_hdr_val;			/* http header value used for health checkes */
+	int check_hdr_val_len;			/* length of the http header value */
+
 	struct check check;			/* health-check specific configuration */
 	struct check agent;			/* agent specific configuration */
 
diff --git a/src/cfgparse.c b/src/cfgparse.c
index 9993c61..05be933 100644
--- a/src/cfgparse.c
+++ b/src/cfgparse.c
@@ -1841,12 +1841,19 @@ int cfg_parse_listen(const char *file, int linenum, char **args, int kwm)
 		if (curproxy->cap & PR_CAP_BE) {
 			curproxy->fullconn = defproxy.fullconn;
 			curproxy->conn_retries = defproxy.conn_retries;
+			curproxy->check_addr = defproxy.check_addr;
 
 			if (defproxy.check_req) {
-				curproxy->check_req = calloc(1, defproxy.check_len);
-				memcpy(curproxy->check_req, defproxy.check_req, defproxy.check_len);
+				curproxy->check_req = calloc(1, defproxy.check_req_len);
+				memcpy(curproxy->check_req, defproxy.check_req, defproxy.check_req_len);
 			}
-			curproxy->check_len = defproxy.check_len;
+			curproxy->check_req_len = defproxy.check_req_len;
+
+			if (defproxy.check_hdr_name) {
+				curproxy->check_hdr_name = calloc(1, defproxy.check_hdr_name_len);
+				memcpy(curproxy->check_hdr_name, defproxy.check_hdr_name, defproxy.check_hdr_name_len);
+			}
+			curproxy->check_hdr_name_len = defproxy.check_hdr_name_len;
 
 			if (defproxy.expect_str) {
 				curproxy->expect_str = strdup(defproxy.expect_str);
@@ -1990,6 +1997,8 @@ int cfg_parse_listen(const char *file, int linenum, char **args, int kwm)
 		free(defproxy.monitor_uri);
 		free(defproxy.defbe.name);
 		free(defproxy.conn_src.iface_name);
+		free(defproxy.check_hdr_name);
+		defproxy.check_hdr_name_len = 0;
 		free(defproxy.fwdfor_hdr_name);
 		defproxy.fwdfor_hdr_len = 0;
 		free(defproxy.orgto_hdr_name);
@@ -3631,11 +3640,11 @@ stats_error_parsing:
 			curproxy->options2 |= PR_O2_HTTP_CHK;
 			if (!*args[2]) { /* no argument */
 				curproxy->check_req = strdup(DEF_CHECK_REQ); /* default request */
-				curproxy->check_len = strlen(DEF_CHECK_REQ);
+				curproxy->check_req_len = strlen(DEF_CHECK_REQ);
 			} else if (!*args[3]) { /* one argument : URI */
 				int reqlen = strlen(args[2]) + strlen("OPTIONS HTTP/1.0\r\n") + 1;
 				curproxy->check_req = (char *)malloc(reqlen);
-				curproxy->check_len = snprintf(curproxy->check_req, reqlen,
+				curproxy->check_req_len = snprintf(curproxy->check_req, reqlen,
 							   "OPTIONS %s HTTP/1.0\r\n", args[2]); /* URI to use */
 			} else { /* more arguments : METHOD URI [HTTP_VER] */
 				int reqlen = strlen(args[2]) + strlen(args[3]) + 3 + strlen("\r\n");
@@ -3645,10 +3654,64 @@ stats_error_parsing:
 					reqlen += strlen("HTTP/1.0");
 
 				curproxy->check_req = (char *)malloc(reqlen);
-				curproxy->check_len = snprintf(curproxy->check_req, reqlen,
+				curproxy->check_req_len = snprintf(curproxy->check_req, reqlen,
 							   "%s %s %s\r\n", args[2], args[3], *args[4]?args[4]:"HTTP/1.0");
 			}
 		}
+		else if (!strcmp(args[1], "httpchksrv")) {
+			if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[1], NULL))
+				err_code |= ERR_WARN;
+
+			/* use a external http check server instead of querying the server for health checks */
+			if (!*args[2]) {
+				Alert("parsing [%s:%d]: '%s' expects an <ipv4|ipv6> address.\n",
+				      file, linenum, args[1]);
+				err_code |= ERR_ALERT | ERR_FATAL;
+				goto out;
+			}
+
+			struct sockaddr_storage *sk;
+			int port1, port2;
+			struct protocol *proto;
+
+			sk = str2sa_range(args[2], &port1, &port2, &errmsg, NULL);
+			if (!sk) {
+				Alert("parsing [%s:%d] : '%s' : %s\n",
+				      file, linenum, args[2], errmsg);
+				err_code |= ERR_ALERT | ERR_FATAL;
+				goto out;
+			}
+
+			proto = protocol_by_family(sk->ss_family);
+			if (!proto || !proto->connect) {
+				Alert("parsing [%s:%d] : '%s %s' : connect() not supported for this address family.\n",
+				      file, linenum, args[1], args[2]);
+				err_code |= ERR_ALERT | ERR_FATAL;
+				goto out;
+			}
+
+			if (port1 != port2) {
+				Alert("parsing [%s:%d] : '%s' : port ranges and offsets are not allowed in '%s'\n",
+				      file, linenum, args[1], args[2]);
+				err_code |= ERR_ALERT | ERR_FATAL;
+				goto out;
+			}
+
+			curproxy->check_addr = *sk;
+
+			if (!*args[3]) { /* no argument */
+				curproxy->check_hdr_name = strdup(DEF_CHECK_HOST_HDR);
+				curproxy->check_hdr_name_len = strlen(DEF_CHECK_HOST_HDR);
+			} else if (*args[4] && !strcmp(args[3], "header")) {
+				curproxy->check_hdr_name = strdup(args[4]);
+				curproxy->check_hdr_name_len = strlen(args[4]);
+			} else {
+				Alert("parsing [%s:%d] : '%s' : valid http header is required when using hdr\n",
+				      file, linenum, args[1]);
+				err_code |= ERR_ALERT | ERR_FATAL;
+				goto out;
+			}
+		}
 		else if (!strcmp(args[1], "ssl-hello-chk")) {
 			/* use SSLv3 CLIENT HELLO to check servers' health */
 			if (warnifnotcap(curproxy, PR_CAP_BE, file, linenum, args[1], NULL))
@@ -3668,18 +3731,18 @@ stats_error_parsing:
 
 			if (!*args[2] || !*args[3]) { /* no argument or incomplete EHLO host */
 				curproxy->check_req = strdup(DEF_SMTP_CHECK_REQ); /* default request */
-				curproxy->check_len = strlen(DEF_SMTP_CHECK_REQ);
+				curproxy->check_req_len = strlen(DEF_SMTP_CHECK_REQ);
 			} else { /* ESMTP EHLO, or SMTP HELO, and a hostname */
 				if (!strcmp(args[2], "EHLO") || !strcmp(args[2], "HELO")) {
 					int reqlen = strlen(args[2]) + strlen(args[3]) + strlen(" \r\n") + 1;
 					curproxy->check_req = (char *)malloc(reqlen);
-					curproxy->check_len = snprintf(curproxy->check_req, reqlen,
+					curproxy->check_req_len = snprintf(curproxy->check_req, reqlen,
 								   "%s %s\r\n", args[2], args[3]); /* HELO hostname */
 				} else {
 					/* this just hits the default for now, but you could potentially expand it to allow for other stuff
 					   though, it's unlikely you'd want to send anything other than an EHLO or HELO */
 					curproxy->check_req = strdup(DEF_SMTP_CHECK_REQ); /* default request */
-					curproxy->check_len = strlen(DEF_SMTP_CHECK_REQ);
+					curproxy->check_req_len = strlen(DEF_SMTP_CHECK_REQ);
 				}
 			}
 		}
@@ -3726,7 +3789,7 @@ stats_error_parsing:
 
 					free(curproxy->check_req);
 					curproxy->check_req = packet;
-					curproxy->check_len = packet_len;
+					curproxy->check_req_len = packet_len;
 
 					packet_len = htonl(packet_len);
 					memcpy(packet, &packet_len, 4);
@@ -3754,7 +3817,7 @@ stats_error_parsing:
 
 			curproxy->check_req = (char *) malloc(sizeof(DEF_REDIS_CHECK_REQ) - 1);
 			memcpy(curproxy->check_req, DEF_REDIS_CHECK_REQ, sizeof(DEF_REDIS_CHECK_REQ) - 1);
-			curproxy->check_len = sizeof(DEF_REDIS_CHECK_REQ) - 1;
+			curproxy->check_req_len = sizeof(DEF_REDIS_CHECK_REQ) - 1;
 		}
 		else if (!strcmp(args[1], "mysql-check")) {
 
@@ -3803,7 +3866,7 @@ stats_error_parsing:
 
 					free(curproxy->check_req);
 					curproxy->check_req = (char *)calloc(1, reqlen);
-					curproxy->check_len = reqlen;
+					curproxy->check_req_len = reqlen;
 
 					snprintf(curproxy->check_req, 4, "%c%c%c",
 						((unsigned char) packetlen & 0xff),
@@ -3836,7 +3899,7 @@ stats_error_parsing:
 
 			curproxy->check_req = (char *) malloc(sizeof(DEF_LDAP_CHECK_REQ) - 1);
 			memcpy(curproxy->check_req, DEF_LDAP_CHECK_REQ, sizeof(DEF_LDAP_CHECK_REQ) - 1);
-			curproxy->check_len = sizeof(DEF_LDAP_CHECK_REQ) - 1;
+			curproxy->check_req_len = sizeof(DEF_LDAP_CHECK_REQ) - 1;
 		}
 		else if (!strcmp(args[1], "tcp-check")) {
 			/* use raw TCPCHK send/expect to check servers' health */
@@ -4563,6 +4626,8 @@ stats_error_parsing:
 			newsrv->state = SRV_RUNNING; /* early server setup */
 			newsrv->last_change = now.tv_sec;
 			newsrv->id = strdup(args[1]);
+			newsrv->check_hdr_val = strdup(args[1]);
+			newsrv->check_hdr_val_len = strlen(args[1]);
 
 			/* several ways to check the port component :
 			 * - IP => port=+0, relative (IPv4 only)
@@ -4812,6 +4877,11 @@ stats_error_parsing:
 				newsrv->check_common.addr = *sk;
 				cur_arg += 2;
 			}
+			else if (!strcmp(args[cur_arg], "header-value")) {
+				newsrv->check_hdr_val = strdup(args[cur_arg + 1]);
+				newsrv->check_hdr_val_len = strlen(args[cur_arg + 1]);
+				cur_arg += 2;
+			}
 			else if (!strcmp(args[cur_arg], "port")) {
 				newsrv->check.port = atol(args[cur_arg + 1]);
 				cur_arg += 2;
@@ -5258,6 +5328,7 @@ stats_error_parsing:
 #endif
 				newsrv->check.send_proxy |= (newsrv->state & SRV_SEND_PROXY);
 			}
+
 			/* try to get the port from check_core.addr if check.port not set */
 			if (!newsrv->check.port)
 				newsrv->check.port = get_host_port(&newsrv->check_common.addr);
@@ -7078,9 +7149,9 @@ out_uri_auth_compat:
 		}
 
 		if ((curproxy->options2 & PR_O2_CHK_ANY) == PR_O2_SSL3_CHK) {
-			curproxy->check_len = sizeof(sslv3_client_hello_pkt) - 1;
-			curproxy->check_req = (char *)malloc(curproxy->check_len);
-			memcpy(curproxy->check_req, sslv3_client_hello_pkt, curproxy->check_len);
+			curproxy->check_req_len = sizeof(sslv3_client_hello_pkt) - 1;
+			curproxy->check_req = (char *)malloc(curproxy->check_req_len);
+			memcpy(curproxy->check_req, sslv3_client_hello_pkt, curproxy->check_req_len);
 		}
 
 		/* ensure that cookie capture length is not too large */
diff --git a/src/checks.c b/src/checks.c
index c3051aa..4fae04a 100644
--- a/src/checks.c
+++ b/src/checks.c
@@ -1254,7 +1254,7 @@ static void event_srv_chk_r(struct connection *conn)
 		if (!done && check->bi->i < 5)
 			goto wait_more_data;
 
-		if (s->proxy->check_len == 0) { // old mode
+		if (s->proxy->check_req_len == 0) { // old mode
 			if (*(check->bi->data + 4) != '\xff') {
 				/* We set the MySQL Version in description for information purpose
 				 * FIXME : it can be cool to use MySQL Version for other purpose,
@@ -1539,7 +1539,13 @@ static struct task *process_chk(struct task *t)
 			 * its own strings.
 			 */
 			if (check->type && check->type != PR_O2_TCPCHK_CHK && !(check->state & CHK_ST_AGENT)) {
-				bo_putblk(check->bo, s->proxy->check_req, s->proxy->check_len);
+
+				/* set up the http request with headers correctly */
+				bo_putblk(check->bo, s->proxy->check_req, s->proxy->check_req_len);
+				bo_putblk(check->bo, s->proxy->check_hdr_name, s->proxy->check_hdr_name_len);
+				bo_putstr(check->bo, ": ");
+				bo_putblk(check->bo, s->check_hdr_val, s->check_hdr_val_len);
+				bo_putstr(check->bo, "\r\n");
 
 				/* we want to check if this host replies to HTTP or SSLv3 requests
 				 * so we'll send the request, and won't wake the checker up now.
@@ -1569,6 +1575,9 @@ static struct task *process_chk(struct task *t)
 			if (is_addr(&s->check_common.addr))
 				/* we'll connect to the check addr specified on the server */
 				conn->addr.to = s->check_common.addr;
+			else if (check->type == PR_O2_HTTP_CHK && is_addr(&s->proxy->check_addr))
+				/* we will connect to the check addr specified on the proxy, only http checks*/
+				conn->addr.to = s->proxy->check_addr;
 			else
 				/* we'll connect to the addr on the server */
 				conn->addr.to = s->addr;
-- 
1.8.3.4 (Apple Git-47)
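
As a reading aid for the last checks.c hunk, the order in which the check target address is chosen can be sketched as a small standalone C function. The types here are simplified stand-ins, not haproxy's: addresses are plain strings, and the enum and `pick_check_target` are hypothetical names that only mirror the order of the three branches in the hunk.

```c
#include <stddef.h>

/* Simplified stand-in for the check type; not haproxy's real constants. */
enum check_type { CHK_TCP, CHK_HTTP, CHK_SSL3 };

/* An address counts as configured when it is a non-empty string. */
static int addr_is_set(const char *addr)
{
	return addr != NULL && *addr != '\0';
}

/* Mirrors the branch order of the hunk:
 *  1. a check address set on the server wins,
 *  2. else, for http checks only, the proxy-level httpchksrv address,
 *  3. else the server's own address. */
const char *pick_check_target(enum check_type type,
                              const char *server_check_addr,
                              const char *proxy_check_addr,
                              const char *server_addr)
{
	if (addr_is_set(server_check_addr))
		return server_check_addr;
	if (type == CHK_HTTP && addr_is_set(proxy_check_addr))
		return proxy_check_addr;
	return server_addr;
}
```

Read this way, a check address configured on an individual server still takes precedence over httpchksrv, and non-http checks keep going to the server itself even when httpchksrv is set.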