Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-11-01 Thread Ben Tisdall
Just noticed the changes have been backported to 1.6, that's great going.

Thanks Baptiste & Willy!

On 30 October 2015 at 14:52, Ben Tisdall  wrote:
> On Fri, Oct 30, 2015 at 2:48 PM, Baptiste  wrote:
>> On Fri, Oct 30, 2015 at 2:10 PM, Lukas Tribus  wrote:
>>>> I sent patches to Willy, and they have been integrated a few minutes ago.
>>>> You can git pull ; make clean ; make [...]
>>>
>>> Unless you use haproxy-1.6, in that case you have to wait for the backport
>>> and the git push, which has not happened yet.
>>>
>> True :)
>> I'm cutting edge: "HAProxy version 1.7-dev0-e4c4b7-18".
>>
>
> We use Vincent Bernat's PPA so we'll need to wait on that too.
> Comparing the upstream and debian changelogs he seems pretty close
> behind you folks though :)
>
> --
> Ben
>



-- 
-Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-30 Thread Ben Tisdall
On Fri, Oct 30, 2015 at 2:48 PM, Baptiste  wrote:
> On Fri, Oct 30, 2015 at 2:10 PM, Lukas Tribus  wrote:
>>> I sent patches to Willy, and they have been integrated a few minutes ago.
>>> You can git pull ; make clean ; make [...]
>>
>> Unless you use haproxy-1.6, in that case you have to wait for the backport
>> and the git push, which has not happened yet.
>>
> True :)
> I'm cutting edge: "HAProxy version 1.7-dev0-e4c4b7-18".
>

We use Vincent Bernat's PPA so we'll need to wait on that too.
Comparing the upstream and debian changelogs he seems pretty close
behind you folks though :)

-- 
Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-30 Thread Baptiste
On Fri, Oct 30, 2015 at 2:10 PM, Lukas Tribus  wrote:
>> I sent patches to Willy, and they have been integrated a few minutes ago.
>> You can git pull ; make clean ; make [...]
>
> Unless you use haproxy-1.6, in that case you have to wait for the backport
> and the git push, which has not happened yet.
>
> Lukas


True :)
I'm cutting edge: "HAProxy version 1.7-dev0-e4c4b7-18".

Baptiste



RE: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-30 Thread Lukas Tribus
> I sent patches to Willy, and they have been integrated a few minutes ago.
> You can git pull ; make clean ; make [...]

Unless you use haproxy-1.6, in that case you have to wait for the backport
and the git push, which has not happened yet.

Lukas

  


Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-30 Thread Baptiste
On Fri, Oct 30, 2015 at 12:53 PM, Ben Tisdall  wrote:
> On Thu, Oct 29, 2015 at 1:43 PM, Ben Tisdall  wrote:
>
>> Sorry, I'm misinterpreting the test results, please ignore that. One
>> ELB address has remained the same today so it's likely HAProxy has
>> been using that and has not needed to update.
>
> Ok, finally observed some more ELB address changes (2, the other
> may've escaped me somehow):
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 18528
>   valid: 18527
>   update: 3
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 1
>
> Proxy is proxying.
>
> --
> Ben


Hi Ben,

Thanks a lot for confirming!
I managed to run it in my lab as well a couple of hours ago to confirm
the problem is fixed.

I sent patches to Willy, and they were integrated a few minutes ago.
You can git pull ; make clean ; make [...]

Baptiste



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-30 Thread Ben Tisdall
On Thu, Oct 29, 2015 at 1:43 PM, Ben Tisdall  wrote:

> Sorry, I'm misinterpreting the test results, please ignore that. One
> ELB address has remained the same today so it's likely HAProxy has
> been using that and has not needed to update.

Ok, finally observed some more ELB address changes (2, the other
may've escaped me somehow):

Resolvers section aws
 nameserver aws_0:
  sent: 18528
  valid: 18527
  update: 3
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 0
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 1

Proxy is proxying.

-- 
Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-29 Thread Jesse Hathaway
On Wed, Oct 28, 2015 at 4:41 PM, Baptiste  wrote:
> So, when you write
>    if (cname && memcmp(ptr, cname, cnamelen))
>       return DNS_UPD_NAME_ERROR;
>    else if (memcmp(ptr, dn_name, dn_name_len))
>       return DNS_UPD_NAME_ERROR;
>
> you compare cname against the name in the current record only if cname
> is set. In Ben's case, cname is set and ptr matches cname, hence memcmp
> returns 0. Since memcmp returns 0, the first condition is false, so
> HAProxy evaluates the else-if and compares ptr to dn_name. That returns
> DNS_UPD_NAME_ERROR, since we're evaluating a CNAME: ptr points to the
> CNAME while dn_name points to the queried name.
>
> Basically, the code parsed the first response record, the CNAME, then
> returned an error because the value of the CNAME no longer matches
> the name in the A record.
>
> With the code below, when cname is set, there is no chance you compare
> ptr and dn_name...
>    if (cname) {
>       if (memcmp(ptr, cname, cnamelen)) {
>          return DNS_UPD_NAME_ERROR;
>       }
>    }
>    else if (memcmp(ptr, dn_name, dn_name_len))
>       return DNS_UPD_NAME_ERROR;

Thank you for the careful explanation Baptiste, that riddle was confounding
our understanding.



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-29 Thread Ben Tisdall
On Thu, Oct 29, 2015 at 1:40 PM, Ben Tisdall  wrote:
> Ok, testing with the latest
> 0001-BUG-MAJOR-dns-first-DNS-response-packet-not-matching.patch
> appears to work from the proxy POV but I'm not seeing the update
> counter incrementing on address changes.

Sorry, I'm misinterpreting the test results, please ignore that. One
ELB address has remained the same today so it's likely HAProxy has
been using that and has not needed to update.



-- 
Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-29 Thread Ben Tisdall
Ok, testing with the latest
0001-BUG-MAJOR-dns-first-DNS-response-packet-not-matching.patch
appears to work from the proxy POV but I'm not seeing the update
counter incrementing on address changes.



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Baptiste
On Wed, Oct 28, 2015 at 7:04 PM, Jesse Hathaway  wrote:
> On Wed, Oct 28, 2015 at 12:00 PM, Baptiste  wrote:
>> Good catch, forget about patch 1, It was 2AM in the morning when I
>> wrote it :'(...
>> I wanted to apply the same code as DNS_UPD_NO_IP_FOUND, and increment
>> the OTHER error.
>
> That is interesting, but I was asking about the second patch,
> 0002-BUG-MINOR-dns-unable-to-parse-CNAMEs-response.patch

Ah, ok!
Anyway, your mail made me read my patches and find the ugly thing in
the other patch :)

So, when you write
   if (cname && memcmp(ptr, cname, cnamelen))
      return DNS_UPD_NAME_ERROR;
   else if (memcmp(ptr, dn_name, dn_name_len))
      return DNS_UPD_NAME_ERROR;

you compare cname against the name in the current record only if cname
is set. In Ben's case, cname is set and ptr matches cname, hence memcmp
returns 0. Since memcmp returns 0, the first condition is false, so
HAProxy evaluates the else-if and compares ptr to dn_name. That returns
DNS_UPD_NAME_ERROR, since we're evaluating a CNAME: ptr points to the
CNAME while dn_name points to the queried name.

Basically, the code parsed the first response record, the CNAME, then
returned an error because the value of the CNAME no longer matches
the name in the A record.

With the code below, when cname is set, there is no chance you compare
ptr and dn_name...
   if (cname) {
      if (memcmp(ptr, cname, cnamelen)) {
         return DNS_UPD_NAME_ERROR;
      }
   }
   else if (memcmp(ptr, dn_name, dn_name_len))
      return DNS_UPD_NAME_ERROR;
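
To make the difference between the two conditionals concrete, here is a
minimal standalone sketch. It is not HAProxy code: the function names,
the NAME_OK/NAME_ERROR values and the example hostnames are made up for
illustration, and plain C strings stand in for the DNS wire-format
names that the real parser compares.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the real return codes; values are arbitrary. */
enum { NAME_OK = 0, NAME_ERROR = 1 };

/* Old logic: when cname is set AND matches, the first condition is false,
 * so the else-if still compares ptr against the queried name. */
int check_name_old(const char *ptr, const char *cname, size_t cnamelen,
                   const char *dn_name, size_t dn_name_len)
{
    if (cname && memcmp(ptr, cname, cnamelen))
        return NAME_ERROR;
    else if (memcmp(ptr, dn_name, dn_name_len))
        return NAME_ERROR;
    return NAME_OK;
}

/* New logic: once cname is set, the queried name is never compared. */
int check_name_new(const char *ptr, const char *cname, size_t cnamelen,
                   const char *dn_name, size_t dn_name_len)
{
    if (cname) {
        if (memcmp(ptr, cname, cnamelen))
            return NAME_ERROR;
    }
    else if (memcmp(ptr, dn_name, dn_name_len))
        return NAME_ERROR;
    return NAME_OK;
}

/* Small self-check using made-up names of equal length (15 bytes). */
void cname_check_demo(void)
{
    const char *target  = "elb.example.net"; /* record name, equals the CNAME value */
    const char *queried = "www.example.org"; /* name we originally asked for */

    /* CNAME already seen and matching: old code rejects, new code accepts. */
    assert(check_name_old(target, target, 15, queried, 15) == NAME_ERROR);
    assert(check_name_new(target, target, 15, queried, 15) == NAME_OK);

    /* No CNAME involved: both versions behave the same. */
    assert(check_name_old(queried, NULL, 0, queried, 15) == NAME_OK);
    assert(check_name_new(queried, NULL, 0, queried, 15) == NAME_OK);
}
```

Calling cname_check_demo() from a main() exercises both cases; only the
"CNAME set and matching" case differs between the two versions.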

Baptiste



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Baptiste
On Wed, Oct 28, 2015 at 6:39 PM, Ben Tisdall  wrote:
> On Wed, Oct 28, 2015 at 6:28 PM, Ben Tisdall  wrote:
>> On Wed, Oct 28, 2015 at 6:00 PM, Baptiste  wrote:
>>>
>>> Ben, could you apply the patch below instead of 0001:
>>>
>>> [snip]
>
> That patch is proving problematic to apply, to save me guessing can
> you provide it as an attachment please.

Hi Ben,

Here you go.

Baptiste
From c96ec88f274689f5dd5b7efd403fccbc8837e748 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann 
Date: Wed, 28 Oct 2015 02:03:32 +0100
Subject: [PATCH 1/2] BUG/MAJOR: dns: first DNS response packet not matching
 queried hostname may lead to a loop

The status DNS_UPD_NAME_ERROR returned by dns_get_ip_from_response and
which means the queried name can't be found in the response was
improperly processed (fell into the default case).
This lead to a loop where HAProxy simply resend a new query as soon as
it got a response for this status and in the only case where such type
of response is the very first one received by the process.

This should be backported into 1.6 branch
---
 src/server.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/server.c b/src/server.c
index dcc5961..c92623d 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2620,6 +2620,17 @@ int snr_resolution_cb(struct dns_resolution *resolution, struct dns_nameserver *
 			}
 			goto stop_resolution;
 
+		case DNS_UPD_NAME_ERROR:
+			/* if this is not the last expected response, we ignore it */
+			if (resolution->nb_responses < nameserver->resolvers->count_nameservers)
+				return 0;
+			/* update resolution status to OTHER error type */
+			if (resolution->status != RSLV_STATUS_OTHER) {
+				resolution->status = RSLV_STATUS_OTHER;
+				resolution->last_status_change = now_ms;
+			}
+			goto stop_resolution;
+
 		default:
 			goto invalid;
 
-- 
2.5.0



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Jesse Hathaway
On Wed, Oct 28, 2015 at 12:00 PM, Baptiste  wrote:
> Good catch, forget about patch 1, It was 2AM in the morning when I
> wrote it :'(...
> I wanted to apply the same code as DNS_UPD_NO_IP_FOUND, and increment
> the OTHER error.

That is interesting, but I was asking about the second patch,
0002-BUG-MINOR-dns-unable-to-parse-CNAMEs-response.patch
From c5f95cda9cf66db99d6088af4ecf82568a4602b4 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann 
Date: Wed, 28 Oct 2015 02:10:02 +0100
Subject: [PATCH 2/2] BUG/MINOR: dns: unable to parse CNAMEs response

A bug lied in the parsing of DNS CNAME response, leading HAProxy to
think the CNAME was improperly resolved in the response.

This should be backported into 1.6 branch
---
 src/dns.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/dns.c b/src/dns.c
index 53b65ab..e28e2a9 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -628,8 +628,11 @@ int dns_get_ip_from_response(unsigned char *resp, unsigned char *resp_end,
 		else
 			ptr = reader;
 
-		if (cname && memcmp(ptr, cname, cnamelen))
-			return DNS_UPD_NAME_ERROR;
+		if (cname) {
+			if (memcmp(ptr, cname, cnamelen)) {
+				return DNS_UPD_NAME_ERROR;
+			}
+		}
 		else if (memcmp(ptr, dn_name, dn_name_len))
 			return DNS_UPD_NAME_ERROR;
 
-- 
2.5.0



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Ben Tisdall
On Wed, Oct 28, 2015 at 6:28 PM, Ben Tisdall  wrote:
> On Wed, Oct 28, 2015 at 6:00 PM, Baptiste  wrote:
>>
>> Ben, could you apply the patch below instead of 0001:
>>
>> [snip]

That patch is proving problematic to apply, to save me guessing can
you provide it as an attachment please.



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Ben Tisdall
On Wed, Oct 28, 2015 at 6:00 PM, Baptiste  wrote:
>
> Ben, could you apply the patch below instead of 0001:
>
> [snip]

Sure, will report back in the morning. Thanks Jesse and Baptiste :)

Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Baptiste
Jesse

On Wed, Oct 28, 2015 at 5:25 PM, Jesse Hathaway  wrote:
> On Tue, Oct 27, 2015 at 8:18 PM, Baptiste  wrote:
>> #2 an error in the way we parse CNAME responses, leading to return an
>> error when validating a CNAME (this triggers bug #1).
>
> How does your patch for this issue change the logic? It appears
> functionally the same to me.

Good catch, forget about patch 1. It was 2 AM when I
wrote it :'(...
I wanted to apply the same code as for DNS_UPD_NO_IP_FOUND, and
increment the OTHER error.

Actually, the bug was triggered because the status of the resolution
was never updated in this very particular case (first DNS response,
can't find requested name in the response), which led the code to
resend a packet, creating a loop.
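
The loop described above can be illustrated with a toy model. This is
only a sketch under stated assumptions, not the code from src/server.c:
the enum values, struct fields and function names are hypothetical, and
"resend" stands for the default path that, per the description above,
sends a new query as soon as a response arrives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified status and action codes. */
enum upd_status { UPD_OK, UPD_NAME_ERROR };
enum action { ACT_STOP, ACT_WAIT, ACT_RESEND };

/* Hypothetical, reduced view of a resolution's state. */
struct resolution {
    int nb_responses;       /* responses received for the current query */
    int count_nameservers;  /* responses expected (one per nameserver) */
    bool status_other;      /* true once the status is set to "other error" */
};

typedef enum action (*handler_fn)(struct resolution *, enum upd_status);

/* Before the fix: the name error is not handled and hits the default case. */
enum action handle_unpatched(struct resolution *r, enum upd_status st)
{
    (void)r;
    switch (st) {
    case UPD_OK:
        return ACT_STOP;
    default:
        return ACT_RESEND; /* name error lands here: query again at once */
    }
}

/* After the fix: the name error updates the status and ends the attempt. */
enum action handle_patched(struct resolution *r, enum upd_status st)
{
    switch (st) {
    case UPD_OK:
        return ACT_STOP;
    case UPD_NAME_ERROR:
        if (r->nb_responses < r->count_nameservers)
            return ACT_WAIT;    /* not the last expected response */
        r->status_other = true; /* record the error ... */
        return ACT_STOP;        /* ... and stop instead of resending */
    default:
        return ACT_RESEND;
    }
}

/* Count queries sent when every response is a name error, capped at `cap`. */
int queries_sent(handler_fn handle, int cap)
{
    struct resolution r = { 0, 1, false };
    int sent = 0;

    while (sent < cap) {
        sent++;
        r.nb_responses = 1; /* the single nameserver answered */
        if (handle(&r, UPD_NAME_ERROR) != ACT_RESEND)
            break;
    }
    return sent;
}

void loop_demo(void)
{
    assert(queries_sent(handle_unpatched, 1000) == 1000); /* runs to the cap */
    assert(queries_sent(handle_patched, 1000) == 1);      /* one query only */
}
```

In this model the unpatched handler keeps sending until the artificial
cap is hit, while the patched one stops after the first response.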

Ben, could you apply the patch below instead of 0001:

diff --git a/src/server.c b/src/server.c
index dcc5961..c92623d 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2620,6 +2620,17 @@ int snr_resolution_cb(struct dns_resolution
*resolution, struct dns_nameserver *
}
goto stop_resolution;

+   case DNS_UPD_NAME_ERROR:
+   /* if this is not the last expected response,
we ignore it */
+   if (resolution->nb_responses <
nameserver->resolvers->count_nameservers)
+   return 0;
+   /* update resolution status to OTHER error type */
+   if (resolution->status != RSLV_STATUS_OTHER) {
+   resolution->status = RSLV_STATUS_OTHER;
+   resolution->last_status_change = now_ms;
+   }
+   goto stop_resolution;
+
default:
goto invalid;


I'll also test it in our amazon lab later tonight.
Then I'll ask Willy to merge them.


Jesse, thanks again for catching this!


Baptiste



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Jesse Hathaway
On Tue, Oct 27, 2015 at 8:18 PM, Baptiste  wrote:
> #2 an error in the way we parse CNAME responses, leading to return an
> error when validating a CNAME (this triggers bug #1).

How does your patch for this issue change the logic? It appears
functionally the same to me.



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Ben Tisdall
On Wed, Oct 28, 2015 at 4:29 PM, Baptiste  wrote:
> Great, thanks for confirming!
>

Thanks for getting this sorted out Baptiste! Any idea of when the
fixes would be likely to be released and make it into the ppa?

-- 
Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Baptiste
Great, thanks for confirming!

Baptiste

On Wed, Oct 28, 2015 at 4:13 PM, Ben Tisdall  wrote:
> On Wed, Oct 28, 2015 at 3:04 PM, Baptiste  wrote:
>> Now, you can simply use whatever tool (ab, httperf, wrk, etc...)
>> hosted on a third party VM to inject traffic on ELB IP directly.
>> After a few minutes (less than 5), ELB service will be moved
>> automatically to an other instance, leading IP to change.
>> On HAProxy stat socket, you should see the 'update' counter to be
>> incremented to 1.
>> Of course, traffic load-balanced by HAProxy should followup as well.
>>
>
> Ok, I forced an address change as you described (good tip btw) and
> sure enough the "update" counter incremented by 1 and the proxy
> continued to function.
>
> --
> Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Ben Tisdall
On Wed, Oct 28, 2015 at 3:04 PM, Baptiste  wrote:
> Now, you can simply use whatever tool (ab, httperf, wrk, etc...)
> hosted on a third party VM to inject traffic on ELB IP directly.
> After a few minutes (less than 5), ELB service will be moved
> automatically to an other instance, leading IP to change.
> On HAProxy stat socket, you should see the 'update' counter to be
> incremented to 1.
> Of course, traffic load-balanced by HAProxy should followup as well.
>

Ok, I forced an address change as you described (good tip btw) and
sure enough the "update" counter incremented by 1 and the proxy
continued to function.

-- 
Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Baptiste
Now, you can simply use whatever tool (ab, httperf, wrk, etc.)
hosted on a third-party VM to inject traffic on the ELB IP directly.
After a few minutes (less than 5), the ELB service will be moved
automatically to another instance, causing its IP to change.
On the HAProxy stats socket, you should see the 'update' counter
incremented to 1.
Of course, traffic load-balanced by HAProxy should follow as well.
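
If you want to check the counters automatically rather than by eye, a
small parser over the text of "show stat resolvers" could look like the
sketch below. The function names are hypothetical, the input format is
assumed to be the "  key: value" lines pasted in this thread, and the
"valid == invalid with no update" test is only a heuristic taken from
the figures reported here, not an official diagnostic.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return the value of counter `key` in resolver stats text, or -1 if absent. */
long resolver_counter(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        const char *p = line;

        while (*p == ' ' || *p == '\t') /* skip indentation */
            p++;
        /* Require "key:" exactly, so "valid" does not match "invalid". */
        if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
            long value;
            if (sscanf(p + klen + 1, "%ld", &value) == 1)
                return value;
        }
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Heuristic: every valid response is also counted invalid, and no update. */
int looks_like_name_error_loop(const char *text)
{
    long valid = resolver_counter(text, "valid");
    long invalid = resolver_counter(text, "invalid");
    long update = resolver_counter(text, "update");

    return valid > 0 && invalid == valid && update == 0;
}

void stats_demo(void)
{
    /* Figures copied from the outputs quoted in this thread. */
    const char *broken =
        "Resolvers section aws\n nameserver aws_0:\n"
        "  sent: 2892976\n  valid: 2887729\n  update: 0\n"
        "  other: 0\n  invalid: 2887729\n";
    const char *fixed =
        "Resolvers section aws\n nameserver aws_0:\n"
        "  sent: 18528\n  valid: 18527\n  update: 3\n"
        "  other: 0\n  invalid: 0\n";

    assert(resolver_counter(broken, "valid") == 2887729);
    assert(looks_like_name_error_loop(broken));
    assert(resolver_counter(fixed, "update") == 3);
    assert(!looks_like_name_error_loop(fixed));
}
```

The text itself would come from the stats socket; how you read it (for
example through a UNIX socket client) is left out of this sketch.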

Baptiste


On Wed, Oct 28, 2015 at 2:05 PM, Ben Tisdall  wrote:
> On Wed, Oct 28, 2015 at 1:55 PM, Baptiste  wrote:
>
>>
>> Have you forced resolution to ipv4 only?
>> if not, could you give it a try?
>>
>
> Right, with "resolve-prefer ipv4":
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 11
>   valid: 11
>   update: 0
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 0



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Ben Tisdall
On Wed, Oct 28, 2015 at 1:55 PM, Baptiste  wrote:

>
> Have you forced resolution to ipv4 only?
> if not, could you give it a try?
>

Right, with "resolve-prefer ipv4":

Resolvers section aws
 nameserver aws_0:
  sent: 11
  valid: 11
  update: 0
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 0
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 0



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Baptiste
On Wed, Oct 28, 2015 at 11:36 AM, Ben Tisdall  wrote:
> On Wed, Oct 28, 2015 at 10:15 AM, Ben Tisdall  
> wrote:
>>
>> Thanks Baptiste, will get on this today.
>>
>
> Ok this in the test environment now and the "other" counter now
> increments in step with "valid", eg:
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 208
>   valid: 104
>   update: 0
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 104
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 0
>
> We'll get some (system-wide) load and regression testing done.
>
> --
> Ben

Have you forced resolution to ipv4 only?
if not, could you give it a try?

Baptiste



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Ben Tisdall
On Wed, Oct 28, 2015 at 10:15 AM, Ben Tisdall  wrote:
>
> Thanks Baptiste, will get on this today.
>

Ok, this is in the test environment now and the "other" counter now
increments in step with "valid", e.g.:

Resolvers section aws
 nameserver aws_0:
  sent: 208
  valid: 104
  update: 0
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 104
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 0

We'll get some (system-wide) load and regression testing done.

-- 
Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-28 Thread Ben Tisdall
On Wed, Oct 28, 2015 at 2:18 AM, Baptiste  wrote:
> Ben,
>
> I found a couple of bugs:
> #1 an incomplete end of processing when the queried hostname can't be
> found in the response. This led to the query loop you may have
> observed.
> #2 an error in the way we parse CNAME responses, which causes an
> error to be returned when validating a CNAME (this triggers bug #1).
>
> Please find attached a couple of patches you could try, and
> report whether you still have an issue or not.
>

Thanks Baptiste, will get on this today.

-- 
Ben



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-27 Thread Baptiste
Ben,

I found a couple of bugs:
#1 an incomplete end of processing when the queried hostname can't be
found in the response. This led to the query loop you may have
observed.
#2 an error in the way we parse CNAME responses, which causes an
error to be returned when validating a CNAME (this triggers bug #1).

Please find attached a couple of patches you could try, and
report whether you still have an issue or not.

Baptiste
From 67687363df5e2b5c82f12ecf2c560d22f9da795c Mon Sep 17 00:00:00 2001
From: Baptiste Assmann 
Date: Wed, 28 Oct 2015 02:03:32 +0100
Subject: [PATCH 1/2] BUG/MAJOR: dns: DNS response packet not matching queried
 hostname may lead to a loop

The status DNS_UPD_NAME_ERROR returned by dns_get_ip_from_response and
which means the queried name can't be found in the response was
improperly processed (fell into the default case).
This lead to a loop where HAProxy simply resend a new query as soon as
it got a response for this status

This should be backported into 1.6 branch
---
 src/server.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/server.c b/src/server.c
index dcc5961..0e0cab3 100644
--- a/src/server.c
+++ b/src/server.c
@@ -2603,6 +2603,7 @@ int snr_resolution_cb(struct dns_resolution *resolution, struct dns_nameserver *
 			}
 			goto stop_resolution;
 
+		case DNS_UPD_NAME_ERROR:
 		case DNS_UPD_SRVIP_NOT_FOUND:
 			goto save_ip;
 
-- 
2.5.0

From c5f95cda9cf66db99d6088af4ecf82568a4602b4 Mon Sep 17 00:00:00 2001
From: Baptiste Assmann 
Date: Wed, 28 Oct 2015 02:10:02 +0100
Subject: [PATCH 2/2] BUG/MINOR: dns: unable to parse CNAMEs response

A bug lied in the parsing of DNS CNAME response, leading HAProxy to
think the CNAME was improperly resolved in the response.

This should be backported into 1.6 branch
---
 src/dns.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/dns.c b/src/dns.c
index 53b65ab..e28e2a9 100644
--- a/src/dns.c
+++ b/src/dns.c
@@ -628,8 +628,11 @@ int dns_get_ip_from_response(unsigned char *resp, unsigned char *resp_end,
 		else
 			ptr = reader;
 
-		if (cname && memcmp(ptr, cname, cnamelen))
-			return DNS_UPD_NAME_ERROR;
+		if (cname) {
+			if (memcmp(ptr, cname, cnamelen)) {
+				return DNS_UPD_NAME_ERROR;
+			}
+		}
 		else if (memcmp(ptr, dn_name, dn_name_len))
 			return DNS_UPD_NAME_ERROR;
 
-- 
2.5.0



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-27 Thread Baptiste
On Wed, Oct 28, 2015 at 12:13 AM, Baptiste  wrote:
> On Tue, Oct 27, 2015 at 11:44 AM, Ben Tisdall  
> wrote:
>> Hi and thanks for a great load balancer. We're developing a much more
>> complex proxy ruleset and being able to switch back to haproxy now
>> that it supports DNS resolution was a huge relief!
>>
>> Unfortunately DNS resolution is not doing what I expect given the
>> configuration. When the downstream ELB to which the server points to
>> switches IP addresses the backend is failing with a L4 timeout on the
>> check. DNS queries are being made, see:
>> https://gist.github.com/btisdall/31b57b57fee19dc79637
>>
>> This is the output of "show stat resolvers":
>>
>> Resolvers section aws
>>  nameserver aws_0:
>>   sent: 2892976
>>   valid: 2887729
>>   update: 0
>>   cname: 0
>>   cname_error: 0
>>   any_err: 0
>>   nx: 0
>>   timeout: 0
>>   refused: 0
>>   other: 0
>>   invalid: 2887729
>>   too_big: 0
>>   truncated: 0
>>   outdated: 0
>>
>> Note that  "valid" and "invalid" counts increase in exact step.
>> Switching to "resolve-prefer ipv4" had no effect on this.
>>
>> Config
>> ======
>>
>> resolvers aws
>>   nameserver aws_0 10.111.0.2:53
>>
>> # ...
>>
>> server myserver some-server.example.com:80 check resolvers aws
>>
>> Build Options
>> =============
>>
>> HA-Proxy version 1.6.1 2015/10/20
>> Copyright 2000-2015 Willy Tarreau 
>>
>> Build options :
>>   TARGET  = linux2628
>>   CPU = generic
>>   CC  = gcc
>>   CFLAGS  = -g -O2 -fstack-protector --param=ssp-buffer-size=4
>> -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
>>   OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
>>
>> Default settings :
>>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>>
>> Encrypted password support via crypt(3): yes
>> Built with zlib version : 1.2.8
>> Compression algorithms supported : identity("identity"),
>> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
>> Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
>> Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
>> OpenSSL library supports TLS extensions : yes
>> OpenSSL library supports SNI : yes
>> OpenSSL library supports prefer-server-ciphers : yes
>> Built with PCRE version : 8.31 2012-07-06
>> PCRE library supports JIT : no (USE_PCRE_JIT not set)
>> Built with Lua version : Lua 5.3.1
>> Built with transparent proxy support using: IP_TRANSPARENT
>> IPV6_TRANSPARENT IP_FREEBIND
>>
>> Available polling systems :
>>       epoll : pref=300,  test result OK
>>        poll : pref=200,  test result OK
>>      select : pref=150,  test result OK
>> Total: 3 (3 usable), will use epoll.
>>
>> Regards,
>>
>> --
>> Ben
>>
>
>
> Hi Ben,
>
> I can't reproduce the problem with the git version.
> I'll try with 1.6.1, but the DNS code is supposed to be the same in
> both versions for now.
>
> I've set up the following Amazon lab:
> - 1 instance running HAProxy, pointing to 1 ELB
> - 1 ELB instance taking traffic from the HAProxy above and
> load-balancing HAProxy's stats page from the same server
> - 1 instance to inject traffic on the ELB to force it to change its IP
> address after a few minutes
>
> The HTTP stream looks like: public > haproxy:8080 > elb:80 > haproxy:80
> It works like a charm.
> I triggered a DNS change on the ELB by massively injecting traffic, and
> here is the output of the DNS stats:
>
> Resolvers section aws
>  nameserver aws1:
>   sent: 95
>   valid: 95
>   update: 1
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 0
>   too_big: 0
>   truncated: 0
>   outdated: 0
>
>
> Here is my configuration:
>
> global
>  daemon
>  log 127.0.0.1:514 local0 info
>  stats socket /tmp/socket level admin
>  stats timeout 10m
>
> resolvers aws
>  nameserver aws1 172.31.0.2:53
>
> defaults HTTP
>  mode http
>  timeout client 10s
>  timeout connect 4s
>  timeout server 10s
>
> frontend f
>  bind :8080
>  default_backend b
>
> backend b
>  server s ${LBNAME}:80 check resolvers aws resolve-prefer ipv4
>
> frontend s
>  bind :80
>  stats enable
>  stats uri /stats
>  stats show-legends
>  http-request redirect location /stats if { path / }
>
>
>
> Please take a real pcap file using tcpdump and send it to me privately.
>
> You also seem to use a CNAME which points to your Amazon ELB name.
> Could you let me know how you set this up, so I can try to reproduce
> the issue in my lab?
>
> Maybe the CNAME parsing is broken.
>
> Baptiste


Ok, I used my personal domain name to create a CNAME pointing to my
internal ELB name, and I can now reproduce the problem:
Resolvers section aws
 nameserver aws1:
  sent: 10485
  valid: 10469
  update: 0
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 12
  timeout: 0
  refused: 0
  other: 0
  invalid: 10469
  too_big: 0
  truncated: 0
  outdated: 0

Now, let's dig in there :)

Baptiste



Re: DNS resolution problem on 1.6.1-1ppa1~trusty

2015-10-27 Thread Baptiste
On Tue, Oct 27, 2015 at 11:44 AM, Ben Tisdall  wrote:
> Hi and thanks for a great load balancer. We're developing a much more
> complex proxy ruleset and being able to switch back to haproxy now
> that it supports DNS resolution was a huge relief!
>
> Unfortunately DNS resolution is not doing what I expect given the
> configuration. When the downstream ELB to which the server points to
> switches IP addresses the backend is failing with a L4 timeout on the
> check. DNS queries are being made, see:
> https://gist.github.com/btisdall/31b57b57fee19dc79637
>
> This is the output of "show stat resolvers":
>
> Resolvers section aws
>  nameserver aws_0:
>   sent: 2892976
>   valid: 2887729
>   update: 0
>   cname: 0
>   cname_error: 0
>   any_err: 0
>   nx: 0
>   timeout: 0
>   refused: 0
>   other: 0
>   invalid: 2887729
>   too_big: 0
>   truncated: 0
>   outdated: 0
>
> Note that  "valid" and "invalid" counts increase in exact step.
> Switching to "resolve-prefer ipv4" had no effect on this.
>
> Config
> ======
>
> resolvers aws
>   nameserver aws_0 10.111.0.2:53
>
> # ...
>
> server myserver some-server.example.com:80 check resolvers aws
>
> Build Options
> =============
>
> HA-Proxy version 1.6.1 2015/10/20
> Copyright 2000-2015 Willy Tarreau 
>
> Build options :
>   TARGET  = linux2628
>   CPU = generic
>   CC  = gcc
>   CFLAGS  = -g -O2 -fstack-protector --param=ssp-buffer-size=4
> -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2
>   OPTIONS = USE_ZLIB=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
>
> Default settings :
>   maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
>
> Encrypted password support via crypt(3): yes
> Built with zlib version : 1.2.8
> Compression algorithms supported : identity("identity"),
> deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
> Built with OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> Running on OpenSSL version : OpenSSL 1.0.1f 6 Jan 2014
> OpenSSL library supports TLS extensions : yes
> OpenSSL library supports SNI : yes
> OpenSSL library supports prefer-server-ciphers : yes
> Built with PCRE version : 8.31 2012-07-06
> PCRE library supports JIT : no (USE_PCRE_JIT not set)
> Built with Lua version : Lua 5.3.1
> Built with transparent proxy support using: IP_TRANSPARENT
> IPV6_TRANSPARENT IP_FREEBIND
>
> Available polling systems :
>       epoll : pref=300,  test result OK
>        poll : pref=200,  test result OK
>      select : pref=150,  test result OK
> Total: 3 (3 usable), will use epoll.
>
> Regards,
>
> --
> Ben
>


Hi Ben,

I can't reproduce the problem with the git version.
I'll try with 1.6.1, but the DNS code is supposed to be the same in
both versions for now.

I've set up the following Amazon lab:
- 1 instance running HAProxy, pointing to 1 ELB
- 1 ELB instance taking traffic from the HAProxy above and
load-balancing HAProxy's stats page from the same server
- 1 instance to inject traffic on the ELB to force it to change its IP
address after a few minutes

The HTTP stream looks like: public > haproxy:8080 > elb:80 > haproxy:80
It works like a charm.
I triggered a DNS change on the ELB by massively injecting traffic, and
here is the output of the DNS stats:

Resolvers section aws
 nameserver aws1:
  sent: 95
  valid: 95
  update: 1
  cname: 0
  cname_error: 0
  any_err: 0
  nx: 0
  timeout: 0
  refused: 0
  other: 0
  invalid: 0
  too_big: 0
  truncated: 0
  outdated: 0


Here is my configuration:

global
 daemon
 log 127.0.0.1:514 local0 info
 stats socket /tmp/socket level admin
 stats timeout 10m

resolvers aws
 nameserver aws1 172.31.0.2:53

defaults HTTP
 mode http
 timeout client 10s
 timeout connect 4s
 timeout server 10s

frontend f
 bind :8080
 default_backend b

backend b
 server s ${LBNAME}:80 check resolvers aws resolve-prefer ipv4

frontend s
 bind :80
 stats enable
 stats uri /stats
 stats show-legends
 http-request redirect location /stats if { path / }



Please take a real pcap file using tcpdump and send it to me privately.

You also seem to use a CNAME which points to your Amazon ELB name.
Could you let me know how you set this up, so I can try to reproduce
the issue in my lab?

Maybe the CNAME parsing is broken.

Baptiste