Re: DNS resolution problem on 1.6.1-1ppa1~trusty
Just noticed the changes have been backported to 1.6, that's great
going. Thanks Baptiste & Willy!

On 30 October 2015 at 14:52, Ben Tisdall wrote:
> On Fri, Oct 30, 2015 at 2:48 PM, Baptiste wrote:
>> On Fri, Oct 30, 2015 at 2:10 PM, Lukas Tribus wrote:
>>>> I sent patches to Willy, and they have been integrated a few minutes ago.
>>>> You can git pull ; make clean ; make [...]
>>>
>>> Unless you use haproxy-1.6, in that case you have to wait for the backport
>>> and the git push, which has not happened yet.
>>>
>> True :)
>> I'm cutting edge: "HAProxy version 1.7-dev0-e4c4b7-18".
>>
>
> We use Vincent Bernat's PPA so we'll need to wait on that too.
> Comparing the upstream and debian changelogs he seems pretty close
> behind you folks though :)
>
> --
> Ben

--
-Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Fri, Oct 30, 2015 at 2:48 PM, Baptiste wrote:
> On Fri, Oct 30, 2015 at 2:10 PM, Lukas Tribus wrote:
>>> I sent patches to Willy, and they have been integrated a few minutes ago.
>>> You can git pull ; make clean ; make [...]
>>
>> Unless you use haproxy-1.6, in that case you have to wait for the backport
>> and the git push, which has not happened yet.
>>
> True :)
> I'm cutting edge: "HAProxy version 1.7-dev0-e4c4b7-18".
>

We use Vincent Bernat's PPA so we'll need to wait on that too.
Comparing the upstream and debian changelogs he seems pretty close
behind you folks though :)

--
Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Fri, Oct 30, 2015 at 2:10 PM, Lukas Tribus wrote:
>> I sent patches to Willy, and they have been integrated a few minutes ago.
>> You can git pull ; make clean ; make [...]
>
> Unless you use haproxy-1.6, in that case you have to wait for the backport
> and the git push, which has not happened yet.
>
> Lukas

True :)
I'm cutting edge: "HAProxy version 1.7-dev0-e4c4b7-18".

Baptiste
RE: DNS resolution problem on 1.6.1-1ppa1~trusty
> I sent patches to Willy, and they have been integrated a few minutes ago.
> You can git pull ; make clean ; make [...]

Unless you use haproxy-1.6, in that case you have to wait for the backport
and the git push, which has not happened yet.

Lukas
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Fri, Oct 30, 2015 at 12:53 PM, Ben Tisdall wrote:
> On Thu, Oct 29, 2015 at 1:43 PM, Ben Tisdall wrote:
>
>> Sorry, I'm misinterpreting the test results, please ignore that. One
>> ELB address has remained the same today so it's likely HAProxy has
>> been using that and has not needed to update.
>
> Ok, finally observed some more ELB address changes (2, the other
> may've escaped me somehow):
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 18528
>   valid: 18527
>   update: 3
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 1
>
> Proxy is proxying.
>
> --
> Ben

Hi Ben,

Thanks a lot for confirming!
I managed to run it in my lab as well a couple of hours ago to confirm
the problem is fixed.

I sent patches to Willy, and they have been integrated a few minutes ago.
You can git pull ; make clean ; make [...]

Baptiste
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Thu, Oct 29, 2015 at 1:43 PM, Ben Tisdall wrote:

> Sorry, I'm misinterpreting the test results, please ignore that. One
> ELB address has remained the same today so it's likely HAProxy has
> been using that and has not needed to update.

Ok, finally observed some more ELB address changes (2, the other
may've escaped me somehow):

Resolvers section aws
 nameserver aws_0:
  sent: 18528
  valid: 18527
  update: 3
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 0
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 1

Proxy is proxying.

--
Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 4:41 PM, Baptiste wrote:
> So, when you write
>     if (cname && memcmp(ptr, cname, cnamelen))
>         return DNS_UPD_NAME_ERROR;
>     else if (memcmp(ptr, dn_name, dn_name_len))
>         return DNS_UPD_NAME_ERROR;
>
> you compare cname against the name in the current record only if cname
> is set.
> In Ben's case, cname is set and ptr matches cname, hence memcmp
> returned 0.
> Since memcmp returns 0, HAProxy then checks the next condition and
> compares ptr to dn_name, which leads it to return DNS_UPD_NAME_ERROR,
> since we're evaluating a CNAME and ptr points to the CNAME while
> dn_name points to the queried name.
>
> Basically, the code parsed the first response record, the CNAME, then
> returned an error because the value of the CNAME no longer matches the
> name in the A record.
>
> With the code below, when cname is set, there is no chance you compare
> ptr and dn_name...
>     if (cname) {
>         if (memcmp(ptr, cname, cnamelen)) {
>             return DNS_UPD_NAME_ERROR;
>         }
>     }
>     else if (memcmp(ptr, dn_name, dn_name_len))
>         return DNS_UPD_NAME_ERROR;

Thank you for the careful explanation Baptiste, that riddle was
confounding our understanding.
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Thu, Oct 29, 2015 at 1:40 PM, Ben Tisdall wrote:
> Ok, testing with the latest
> 0001-BUG-MAJOR-dns-first-DNS-response-packet-not-matching.patch
> appears to work from the proxy POV but I'm not seeing the update
> counter incrementing on address changes.

Sorry, I'm misinterpreting the test results, please ignore that. One
ELB address has remained the same today so it's likely HAProxy has
been using that and has not needed to update.

--
Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
Ok, testing with the latest 0001-BUG-MAJOR-dns-first-DNS-response-packet-not-matching.patch appears to work from the proxy POV but I'm not seeing the update counter incrementing on address changes.
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 7:04 PM, Jesse Hathaway wrote:
> On Wed, Oct 28, 2015 at 12:00 PM, Baptiste wrote:
>> Good catch, forget about patch 1, It was 2AM in the morning when I
>> wrote it :'(...
>> I wanted to apply the same code as DNS_UPD_NO_IP_FOUND, and increment
>> the OTHER error.
>
> That is interesting, but I was asking about the second patch,
> 0002-BUG-MINOR-dns-unable-to-parse-CNAMEs-response.patch

Ah, ok!
Anyway, your mail made me read my patches and find the ugly thing in
the other patch :)

So, when you write

    if (cname && memcmp(ptr, cname, cnamelen))
        return DNS_UPD_NAME_ERROR;
    else if (memcmp(ptr, dn_name, dn_name_len))
        return DNS_UPD_NAME_ERROR;

you compare cname against the name in the current record only if cname
is set.
In Ben's case, cname is set and ptr matches cname, hence memcmp
returned 0.
Since memcmp returns 0, HAProxy then checks the next condition and
compares ptr to dn_name, which leads it to return DNS_UPD_NAME_ERROR,
since we're evaluating a CNAME and ptr points to the CNAME while
dn_name points to the queried name.

Basically, the code parsed the first response record, the CNAME, then
returned an error because the value of the CNAME no longer matches the
name in the A record.

With the code below, when cname is set, there is no chance you compare
ptr and dn_name...

    if (cname) {
        if (memcmp(ptr, cname, cnamelen)) {
            return DNS_UPD_NAME_ERROR;
        }
    }
    else if (memcmp(ptr, dn_name, dn_name_len))
        return DNS_UPD_NAME_ERROR;

Baptiste
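
For anyone who wants to convince themselves of the difference between the two forms, here is a minimal standalone sketch. It is not HAProxy code: the function names, the return-code enum and the use of plain C strings (rather than DNS label-encoded names) are assumptions made purely for illustration. Only the two conditionals are taken from the thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for HAProxy's return codes (illustrative values). */
enum { UPD_OK = 0, UPD_NAME_ERROR = 1 };

/* Pre-patch form: when cname is set AND matches ptr, the first condition
 * is false, so control falls into the else-if and ptr is compared against
 * the queried name as well. */
static int check_name_before(const char *ptr,
                             const char *cname, size_t cnamelen,
                             const char *dn_name, size_t dn_name_len)
{
    if (cname && memcmp(ptr, cname, cnamelen))
        return UPD_NAME_ERROR;
    else if (memcmp(ptr, dn_name, dn_name_len))
        return UPD_NAME_ERROR;
    return UPD_OK;
}

/* Patched form: once cname is set, the else-if branch can never run, so
 * ptr is only ever compared against the CNAME. */
static int check_name_after(const char *ptr,
                            const char *cname, size_t cnamelen,
                            const char *dn_name, size_t dn_name_len)
{
    if (cname) {
        if (memcmp(ptr, cname, cnamelen))
            return UPD_NAME_ERROR;
    }
    else if (memcmp(ptr, dn_name, dn_name_len))
        return UPD_NAME_ERROR;
    return UPD_OK;
}
```

Calling both with a record name equal to the CNAME target and a different queried name (say "elb.example.net" against "www.example.com", both made-up names of equal length) gives an error from the first function and success from the second. With cname set to NULL the two behave identically.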
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 6:39 PM, Ben Tisdall wrote:
> On Wed, Oct 28, 2015 at 6:28 PM, Ben Tisdall wrote:
>> On Wed, Oct 28, 2015 at 6:00 PM, Baptiste wrote:
>>>
>>> Ben, could you apply the patch below instead of 0001:
>>>
>>> [snip]
>
> That patch is proving problematic to apply, to save me guessing can
> you provide it as an attachment please.

Hi Ben,

Here you go.

Baptiste

From c96ec88f274689f5dd5b7efd403fccbc8837e748 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann
Date: Wed, 28 Oct 2015 02:03:32 +0100
Subject: [PATCH 1/2] BUG/MAJOR: dns: first DNS response packet not matching
 queried hostname may lead to a loop

The status DNS_UPD_NAME_ERROR returned by dns_get_ip_from_response and
which means the queried name can't be found in the response was
improperly processed (fell into the default case).
This lead to a loop where HAProxy simply resend a new query as soon as
it got a response for this status and in the only case where such type
of response is the very first one received by the process.

This should be backported into 1.6 branch
---
 src/server.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/server.c b/src/server.c
index dcc5961..c92623d 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2620,6 +2620,17 @@ int snr_resolution_cb(struct dns_resolution *resolution, struct dns_nameserver *
             }
             goto stop_resolution;
 
+        case DNS_UPD_NAME_ERROR:
+            /* if this is not the last expected response, we ignore it */
+            if (resolution->nb_responses < nameserver->resolvers->count_nameservers)
+                return 0;
+            /* update resolution status to OTHER error type */
+            if (resolution->status != RSLV_STATUS_OTHER) {
+                resolution->status = RSLV_STATUS_OTHER;
+                resolution->last_status_change = now_ms;
+            }
+            goto stop_resolution;
+
         default:
             goto invalid;
 
--
2.5.0
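
The decision made by the new case can be looked at in isolation. The following is only a simplified sketch under stated assumptions: the struct, the enum and the helper name are hypothetical, and only the field names nb_responses, status and last_status_change and the comparison against the nameserver count come from the patch. It returns false where the patch does "return 0" (keep waiting) and true where the patch jumps to stop_resolution.

```c
#include <stdbool.h>

/* Hypothetical, reduced model of a resolution; not HAProxy's real struct. */
enum sketch_status { SKETCH_STATUS_NONE, SKETCH_STATUS_VALID, SKETCH_STATUS_OTHER };

struct sketch_resolution {
    int nb_responses;                 /* responses received for this query */
    enum sketch_status status;        /* last recorded resolution status */
    unsigned int last_status_change;  /* timestamp of the last status change */
};

/* Handle a "queried name not found in response" result.
 * Returns false while answers from other nameservers are still expected,
 * true once the resolution should be stopped. */
static bool sketch_on_name_error(struct sketch_resolution *res,
                                 int count_nameservers,
                                 unsigned int now_ms)
{
    /* not the last expected response: ignore it and keep waiting */
    if (res->nb_responses < count_nameservers)
        return false;

    /* record the error once, so the status is no longer left untouched */
    if (res->status != SKETCH_STATUS_OTHER) {
        res->status = SKETCH_STATUS_OTHER;
        res->last_status_change = now_ms;
    }
    return true;
}
```

The point of the sketch is the second half: the status gets recorded and the resolution is stopped, rather than falling into a default path that, per the commit message, ended up resending a query immediately.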
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 12:00 PM, Baptiste wrote:
> Good catch, forget about patch 1, It was 2AM in the morning when I
> wrote it :'(...
> I wanted to apply the same code as DNS_UPD_NO_IP_FOUND, and increment
> the OTHER error.

That is interesting, but I was asking about the second patch,
0002-BUG-MINOR-dns-unable-to-parse-CNAMEs-response.patch

From c5f95cda9cf66db99d6088af4ecf82568a4602b4 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann
Date: Wed, 28 Oct 2015 02:10:02 +0100
Subject: [PATCH 2/2] BUG/MINOR: dns: unable to parse CNAMEs response

A bug lied in the parsing of DNS CNAME response, leading HAProxy to
think the CNAME was improperly resolved in the response.

This should be backported into 1.6 branch
---
 src/dns.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/dns.c b/src/dns.c
index 53b65ab..e28e2a9 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -628,8 +628,11 @@ int dns_get_ip_from_response(unsigned char *resp, unsigned char *resp_end,
         else
             ptr = reader;
 
-        if (cname && memcmp(ptr, cname, cnamelen))
-            return DNS_UPD_NAME_ERROR;
+        if (cname) {
+            if (memcmp(ptr, cname, cnamelen)) {
+                return DNS_UPD_NAME_ERROR;
+            }
+        }
         else if (memcmp(ptr, dn_name, dn_name_len))
             return DNS_UPD_NAME_ERROR;
 
--
2.5.0
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 6:28 PM, Ben Tisdall wrote:
> On Wed, Oct 28, 2015 at 6:00 PM, Baptiste wrote:
>>
>> Ben, could you apply the patch below instead of 0001:
>>
>> [snip]

That patch is proving problematic to apply, to save me guessing can
you provide it as an attachment please.
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 6:00 PM, Baptiste wrote:
>
> Ben, could you apply the patch below instead of 0001:
>
> [snip]

Sure, will report back in the morning.

Thanks Jesse and Baptiste :)

Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
Jesse

On Wed, Oct 28, 2015 at 5:25 PM, Jesse Hathaway wrote:
> On Tue, Oct 27, 2015 at 8:18 PM, Baptiste wrote:
>> #2 an error in the way we parse CNAME responses, leading to return an
>> error when validating a CNAME (this triggers bug #1).
>
> How does your patch for this issue change the logic? It appears
> functionally the same to me.

Good catch, forget about patch 1, It was 2AM in the morning when I
wrote it :'(...
I wanted to apply the same code as DNS_UPD_NO_IP_FOUND, and increment
the OTHER error.
Actually, the bug was triggered because the status of the resolution
was never updated in this very particular case (first DNS response,
can't find requested name in the response), which led the code to
resend a packet, creating a loop.

Ben, could you apply the patch below instead of 0001:

diff --git a/src/server.c b/src/server.c
index dcc5961..c92623d 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2620,6 +2620,17 @@ int snr_resolution_cb(struct dns_resolution *resolution, struct dns_nameserver *
             }
             goto stop_resolution;
 
+        case DNS_UPD_NAME_ERROR:
+            /* if this is not the last expected response, we ignore it */
+            if (resolution->nb_responses < nameserver->resolvers->count_nameservers)
+                return 0;
+            /* update resolution status to OTHER error type */
+            if (resolution->status != RSLV_STATUS_OTHER) {
+                resolution->status = RSLV_STATUS_OTHER;
+                resolution->last_status_change = now_ms;
+            }
+            goto stop_resolution;
+
         default:
             goto invalid;

I'll also test it in our amazon lab later tonight.
Then I'll ask Willy to merge them.

Jesse, thanks again for catching this!

Baptiste
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Tue, Oct 27, 2015 at 8:18 PM, Baptiste wrote:
> #2 an error in the way we parse CNAME responses, leading to return an
> error when validating a CNAME (this triggers bug #1).

How does your patch for this issue change the logic? It appears
functionally the same to me.
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 4:29 PM, Baptiste wrote:
> Great, thanks for confirming!
>

Thanks for getting this sorted out Baptiste! Any idea of when the
fixes would be likely to be released and make it into the ppa?

--
Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
Great, thanks for confirming!

Baptiste

On Wed, Oct 28, 2015 at 4:13 PM, Ben Tisdall wrote:
> On Wed, Oct 28, 2015 at 3:04 PM, Baptiste wrote:
>> Now, you can simply use whatever tool (ab, httperf, wrk, etc...)
>> hosted on a third party VM to inject traffic on the ELB IP directly.
>> After a few minutes (less than 5), the ELB service will be moved
>> automatically to another instance, leading the IP to change.
>> On the HAProxy stats socket, you should see the 'update' counter
>> incremented to 1.
>> Of course, traffic load-balanced by HAProxy should follow as well.
>>
>
> Ok, I forced an address change as you described (good tip btw) and
> sure enough the "update" counter incremented by 1 and the proxy
> continued to function.
>
> --
> Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 3:04 PM, Baptiste wrote:
> Now, you can simply use whatever tool (ab, httperf, wrk, etc...)
> hosted on a third party VM to inject traffic on the ELB IP directly.
> After a few minutes (less than 5), the ELB service will be moved
> automatically to another instance, leading the IP to change.
> On the HAProxy stats socket, you should see the 'update' counter
> incremented to 1.
> Of course, traffic load-balanced by HAProxy should follow as well.
>

Ok, I forced an address change as you described (good tip btw) and
sure enough the "update" counter incremented by 1 and the proxy
continued to function.

--
Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
Now, you can simply use whatever tool (ab, httperf, wrk, etc...)
hosted on a third party VM to inject traffic on the ELB IP directly.
After a few minutes (less than 5), the ELB service will be moved
automatically to another instance, leading the IP to change.
On the HAProxy stats socket, you should see the 'update' counter
incremented to 1.
Of course, traffic load-balanced by HAProxy should follow as well.

Baptiste

On Wed, Oct 28, 2015 at 2:05 PM, Ben Tisdall wrote:
> On Wed, Oct 28, 2015 at 1:55 PM, Baptiste wrote:
>
>>
>> Have you forced resolution to ipv4 only?
>> if not, could you give it a try?
>>
>
> Right, with "resolve-prefer ipv4":
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 11
>   valid: 11
>   update: 0
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 0
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 1:55 PM, Baptiste wrote:

>
> Have you forced resolution to ipv4 only?
> if not, could you give it a try?
>

Right, with "resolve-prefer ipv4":

Resolvers section aws
 nameserver aws_0:
  sent: 11
  valid: 11
  update: 0
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 0
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 0
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 11:36 AM, Ben Tisdall wrote:
> On Wed, Oct 28, 2015 at 10:15 AM, Ben Tisdall wrote:
>>
>> Thanks Baptiste, will get on this today.
>>
>
> Ok, this is in the test environment now and the "other" counter now
> increments in step with "valid", eg:
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 208
>   valid: 104
>   update: 0
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 104
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 0
>
> We'll get some (system-wide) load and regression testing done.
>
> --
> Ben

Have you forced resolution to ipv4 only?
If not, could you give it a try?

Baptiste
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 10:15 AM, Ben Tisdall wrote:
>
> Thanks Baptiste, will get on this today.
>

Ok, this is in the test environment now and the "other" counter now
increments in step with "valid", eg:

Resolvers section aws
 nameserver aws_0:
  sent: 208
  valid: 104
  update: 0
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 104
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 0

We'll get some (system-wide) load and regression testing done.

--
Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 2:18 AM, Baptiste wrote:
> Ben,
>
> I found a couple of bugs:
> #1 an incomplete end of processing when the queried hostname can't be
> found in the response. This led to the query loop you may have
> observed.
> #2 an error in the way we parse CNAME responses, leading to return an
> error when validating a CNAME (this triggers bug #1).
>
> Please find in attachment a couple of patches you could give a try and
> report whether you still have an issue or not.
>

Thanks Baptiste, will get on this today.

--
Ben
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
Ben,

I found a couple of bugs:
#1 an incomplete end of processing when the queried hostname can't be
found in the response. This led to the query loop you may have
observed.
#2 an error in the way we parse CNAME responses, leading to return an
error when validating a CNAME (this triggers bug #1).

Please find in attachment a couple of patches you could give a try and
report whether you still have an issue or not.

Baptiste

From 67687363df5e2b5c82f12ecf2c560d22f9da795c Mon Sep 17 00:00:00 2001
From: Baptiste Assmann
Date: Wed, 28 Oct 2015 02:03:32 +0100
Subject: [PATCH 1/2] BUG/MAJOR: dns: DNS response packet not matching queried
 hostname may lead to a loop

The status DNS_UPD_NAME_ERROR returned by dns_get_ip_from_response and
which means the queried name can't be found in the response was
improperly processed (fell into the default case).
This lead to a loop where HAProxy simply resend a new query as soon as
it got a response for this status

This should be backported into 1.6 branch
---
 src/server.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/server.c b/src/server.c
index dcc5961..0e0cab3 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2603,6 +2603,7 @@ int snr_resolution_cb(struct dns_resolution *resolution, struct dns_nameserver *
             }
             goto stop_resolution;
 
+        case DNS_UPD_NAME_ERROR:
         case DNS_UPD_SRVIP_NOT_FOUND:
             goto save_ip;
 
--
2.5.0

From c5f95cda9cf66db99d6088af4ecf82568a4602b4 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann
Date: Wed, 28 Oct 2015 02:10:02 +0100
Subject: [PATCH 2/2] BUG/MINOR: dns: unable to parse CNAMEs response

A bug lied in the parsing of DNS CNAME response, leading HAProxy to
think the CNAME was improperly resolved in the response.

This should be backported into 1.6 branch
---
 src/dns.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/dns.c b/src/dns.c
index 53b65ab..e28e2a9 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -628,8 +628,11 @@ int dns_get_ip_from_response(unsigned char *resp, unsigned char *resp_end,
         else
             ptr = reader;
 
-        if (cname && memcmp(ptr, cname, cnamelen))
-            return DNS_UPD_NAME_ERROR;
+        if (cname) {
+            if (memcmp(ptr, cname, cnamelen)) {
+                return DNS_UPD_NAME_ERROR;
+            }
+        }
         else if (memcmp(ptr, dn_name, dn_name_len))
             return DNS_UPD_NAME_ERROR;
 
--
2.5.0
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Wed, Oct 28, 2015 at 12:13 AM, Baptiste wrote:
> On Tue, Oct 27, 2015 at 11:44 AM, Ben Tisdall wrote:
>> Hi and thanks for a great load balancer. We're developing a much more
>> complex proxy ruleset and being able to switch back to haproxy now
>> that it supports DNS resolution was a huge relief!
>>
>> Unfortunately DNS resolution is not doing what I expect given the
>> configuration. When the downstream ELB to which the server points
>> switches IP addresses the backend is failing with a L4 timeout on the
>> check. DNS queries are being made, see:
>> https://gist.github.com/btisdall/31b57b57fee19dc79637
>>
>> This is the output of "show stat resolvers":
>>
>> Resolvers section aws
>>  nameserver aws_0:
>>   sent: 2892976
>>   valid: 2887729
>>   update: 0
>>   cname: 0
>>   cname_error: 0
>>   any_err: 0
>>   nx: 0
>>   timeout: 0
>>   refused: 0
>>   other: 0
>>   invalid: 2887729
>>   too_big: 0
>>   truncated: 0
>>   outdated: 0
>>
>> Note that "valid" and "invalid" counts increase in exact step.
>> Switching to "resolve-prefer ipv4" had no effect on this.
>>
>> Config
>> ======
>>
>> resolvers aws
>>   nameserver aws_0 10.111.0.2:53
>>
>> # ...
>>
>> server myserver some-server.example.com:80 check resolvers aws
>>
>> Build Options
>> =============
>>
>> HA-Proxy version 1.6.1 2015/10/20
>> Copyright 2000-2015 Willy Tarreau
>>
>> Build options :
>>   TARGET = linux2628
>>   CPU = generic
>>   CC = gcc
>>   CFLAGS = -g -O2 -fstack-protector --param=ssp-buffer-size=4
>> -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
>>   OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
>>
>> Default settings :
>>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>>
>> Encrypted password support via crypt(3): yes
>> Built with zlib version : 1.2.8
>> Compression algorithms supported : identity("identity"),
>> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
>> Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
>> Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
>> OpenSSL library supports TLS extensions : yes
>> OpenSSL library supports SNI : yes
>> OpenSSL library supports prefer-server-ciphers : yes
>> Built with PCRE version : 8.31 2012-07-06
>> PCRE library supports JIT : no (USE_PCRE_JIT not set)
>> Built with Lua version : Lua 5.3.1
>> Built with transparent proxy support using: IP_TRANSPARENT
>> IPV6_TRANSPARENT IP_FREEBIND
>>
>> Available polling systems :
>>   epoll : pref=300, test result OK
>>   poll : pref=200, test result OK
>>   select : pref=150, test result OK
>> Total: 3 (3 usable), will use epoll.
>>
>> Regards,
>>
>> --
>> Ben
>>
>
>
> Hi Ben,
>
> I can't reproduce the problem with git version.
> I'll try with 1.6.1, but DNS code is supposed to be the same between
> both versions for now.
>
> I've set up the following amazon lab:
> - 1 instance with HAProxy running, pointing to 1 ELB
> - 1 ELB instance taking traffic from the haproxy above and
> load-balancing haproxy's stats page from the above server
> - 1 instance to inject traffic on ELB to force it to change its IP
> address after a few minutes
>
> HTTP stream is like: public > haproxy:8080 > elb:80 > haproxy:80
> It works like a charm.
> I triggered a DNS change on ELB by massively injecting traffic and
> here is the output of DNS stats:
>
> Resolvers section aws
>  nameserver aws1:
>   sent: 95
>   valid: 95
>   update: 1
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 0
>
>
> Here is my configuration:
>
> global
>  daemon
>  log 127.0.0.1:514 local0 info
>  stats socket /tmp/socket level admin
>  stats timeout 10m
>
> resolvers aws
>  nameserver aws1 172.31.0.2:53
>
> defaults HTTP
>  mode http
>  timeout client 10s
>  timeout connect 4s
>  timeout server 10s
>
> frontend f
>  bind :8080
>  default_backend b
>
> backend b
>  server s ${LBNAME}:80 check resolvers aws resolve-prefer ipv4
>
> frontend s
>  bind :80
>  stats enable
>  stats uri /stats
>  stats show-legends
>  http-request redirect location /stats if { path / }
>
>
>
> Please take a real pcap file using tcpdump and send it to me privately.
>
> You also seem to use a CNAME which points to your ELB amazon name.
> Could you let me know how you set this up, so I can try to reproduce
> the issue in my lab?
>
> Maybe the CNAME parsing is broken.
>
> Baptiste

Ok, I use my personal domain name to create a CNAME pointing to my
internal ELB name and I can now reproduce the problem:

Resolvers section aws
 nameserver aws1:
  sent: 10485
  valid: 10469
  update: 0
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 12
  timeout: 0
  refused: 0
  other: 0
  invalid: 10469
  too_big: 0
  truncated: 0
  outdated: 0

Now, let's dig in there :)

Baptiste
Re: DNS resolution problem on 1.6.1-1ppa1~trusty
On Tue, Oct 27, 2015 at 11:44 AM, Ben Tisdall wrote:
> Hi and thanks for a great load balancer. We're developing a much more
> complex proxy ruleset and being able to switch back to haproxy now
> that it supports DNS resolution was a huge relief!
>
> Unfortunately DNS resolution is not doing what I expect given the
> configuration. When the downstream ELB to which the server points
> switches IP addresses the backend is failing with a L4 timeout on the
> check. DNS queries are being made, see:
> https://gist.github.com/btisdall/31b57b57fee19dc79637
>
> This is the output of "show stat resolvers":
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 2892976
>   valid: 2887729
>   update: 0
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 2887729
>   too_big: 0
>   truncated: 0
>   outdated: 0
>
> Note that "valid" and "invalid" counts increase in exact step.
> Switching to "resolve-prefer ipv4" had no effect on this.
>
> Config
> ======
>
> resolvers aws
>   nameserver aws_0 10.111.0.2:53
>
> # ...
>
> server myserver some-server.example.com:80 check resolvers aws
>
> Build Options
> =============
>
> HA-Proxy version 1.6.1 2015/10/20
> Copyright 2000-2015 Willy Tarreau
>
> Build options :
>   TARGET = linux2628
>   CPU = generic
>   CC = gcc
>   CFLAGS = -g -O2 -fstack-protector --param=ssp-buffer-size=4
> -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
>   OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
>
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.8
> Compression algorithms supported : identity("identity"),
> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
> Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 8.31 2012-07-06
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with Lua version : Lua 5.3.1
> Built with transparent proxy support using: IP_TRANSPARENT
> IPV6_TRANSPARENT IP_FREEBIND
>
> Available polling systems :
>   epoll : pref=300, test result OK
>   poll : pref=200, test result OK
>   select : pref=150, test result OK
> Total: 3 (3 usable), will use epoll.
>
> Regards,
>
> --
> Ben
>

Hi Ben,

I can't reproduce the problem with git version.
I'll try with 1.6.1, but DNS code is supposed to be the same between
both versions for now.

I've set up the following amazon lab:
- 1 instance with HAProxy running, pointing to 1 ELB
- 1 ELB instance taking traffic from the haproxy above and
load-balancing haproxy's stats page from the above server
- 1 instance to inject traffic on ELB to force it to change its IP
address after a few minutes

HTTP stream is like: public > haproxy:8080 > elb:80 > haproxy:80
It works like a charm.
I triggered a DNS change on ELB by massively injecting traffic and
here is the output of DNS stats:

Resolvers section aws
 nameserver aws1:
  sent: 95
  valid: 95
  update: 1
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 0
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 0


Here is my configuration:

global
 daemon
 log 127.0.0.1:514 local0 info
 stats socket /tmp/socket level admin
 stats timeout 10m

resolvers aws
 nameserver aws1 172.31.0.2:53

defaults HTTP
 mode http
 timeout client 10s
 timeout connect 4s
 timeout server 10s

frontend f
 bind :8080
 default_backend b

backend b
 server s ${LBNAME}:80 check resolvers aws resolve-prefer ipv4

frontend s
 bind :80
 stats enable
 stats uri /stats
 stats show-legends
 http-request redirect location /stats if { path / }



Please take a real pcap file using tcpdump and send it to me privately.

You also seem to use a CNAME which points to your ELB amazon name.
Could you let me know how you set this up, so I can try to reproduce
the issue in my lab?

Maybe the CNAME parsing is broken.

Baptiste
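
The symptom in the original report ("valid" and "invalid" rising in exact step while "update" stays at 0) is easy to check for mechanically. Below is a rough sketch, assuming the output of "show stat resolvers" has already been captured into a string and covers a single nameserver. The function names are made up, and the check itself is only a heuristic drawn from the numbers posted in this thread, not an official diagnostic.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Find "<name>: <number>" in a captured stats dump and return the number,
 * or -1 if the counter is absent. A match must start the text or follow
 * whitespace, so looking up "valid" does not hit "invalid", and it must be
 * followed directly by ':', so "cname" does not hit "cname_error". */
static long counter_value(const char *text, const char *name)
{
    size_t len = strlen(name);
    const char *p = text;

    while ((p = strstr(p, name)) != NULL) {
        int at_boundary = (p == text) ||
                          p[-1] == ' ' || p[-1] == '\t' || p[-1] == '\n';
        if (at_boundary && p[len] == ':')
            return strtol(p + len + 1, NULL, 10);
        p += len;
    }
    return -1;
}

/* Heuristic from this thread: every response counted as valid is also
 * counted as invalid, and no address update has ever been applied. */
static int looks_like_cname_symptom(const char *text)
{
    long valid = counter_value(text, "valid");
    long invalid = counter_value(text, "invalid");
    long update = counter_value(text, "update");

    return valid > 0 && invalid == valid && update == 0;
}
```

Fed the counters from the first report (valid 2887729, invalid 2887729, update 0) the check fires; fed the counters from the lab setup above (valid 95, invalid 0, update 1) it does not.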