Ah yes, I also added the "init-addr none" statement to the
server-template line.
This prevents HAProxy from using the libc resolver at startup, which
might lead to unpredictable behavior in that environment...
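
For reference, a server-template line with that statement could look
like the sketch below. The backend name, server prefix, slot count and
SRV record name are placeholders made up for illustration, not the
values from the actual setup; "awsdns" is the resolvers section from my
previous mail:

```
backend be_app
  # hypothetical names: "app" prefix, 10 slots, placeholder SRV record
  server-template app 10 _http._tcp.app.example.local resolvers awsdns init-addr none check
```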

Baptiste

On Tue, Jul 3, 2018 at 3:18 PM, Baptiste <[email protected]> wrote:

> Well, I can partially reproduce the issue you're facing and I can see some
> weird behavior of AWS's DNS servers.
>
> First, by default, HAProxy only supports DNS over UDP and accepts up to
> 512 bytes of payload in the DNS response.
> DNS over TCP is not yet available, but the accepted payload size can be
> increased using the EDNS0 extension.
>
> There is a "magic" number of SRV records with AWS and the default HAProxy
> accepted payload size: at around 4 SRV records, the response payload may
> be bigger than 512 bytes.
> When that happens, the AWS DNS server does not return any data; it simply
> returns an empty response with the TRUNCATED flag set.
> In such a case, a client is supposed to replay the request over TCP...
>
> Another magic value with AWS DNS servers is that they won't return more
> than 8 SRV records, even if you have 10 servers in your service (even
> over TCP).
> AWS DNS servers simply return a round-robin list of the records: some
> will disappear, some will reappear at some point in time.
>
>
> Conclusion: to make HAProxy work in such an environment, you want to
> configure it this way:
> resolvers awsdns
>   nameserver dns0 NAMESERVER:53     # <=== please remove the double quotes
>   accepted_payload_size 8192        # <=== workaround for too short accepted payload
>   hold obsolete 30s                 # <=== workaround for limited number of records returned by AWS
>
> You may want to read the documentation of HAProxy's resolvers. There are
> a few other timeout / hold periods you could tune.
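>
> As a rough sketch of those other timeout / hold settings, a fuller
> resolvers section could look like this. The directives are the ones
> described in the resolvers documentation; the values are only
> illustrative starting points, not something I tested against AWS:
>
> ```
> resolvers awsdns
>   nameserver dns0 NAMESERVER:53
>   accepted_payload_size 8192
>   resolve_retries 3        # queries sent before giving up on a resolution
>   timeout resolve 1s       # interval between two successive resolutions
>   timeout retry   1s       # wait before retrying an unanswered query
>   hold valid      10s      # how long a valid response is kept
>   hold nx         30s      # how long to keep the last address after NXDOMAIN
>   hold timeout    30s      # same, when the name server does not answer
>   hold refused    30s      # same, when the query is refused
>   hold other      30s      # same, for any other error
>   hold obsolete   30s      # keep servers missing from the SRV response
> ```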
>
> With the configuration above, I could easily scale from 2 to 10, back to
> 2, passing through 4, 8, etc... successfully and without any server
> flapping.
> I did not try to go higher than 10. Bear in mind the "hold obsolete"
> period is the period during which HAProxy considers a server as available
> even if the DNS server did not return it in the SRV record list.
>
> Baptiste
>
>
>
>
>
>
>
> On Tue, Jul 3, 2018 at 1:26 PM, Baptiste <[email protected]> wrote:
>
>> Answering myself... I found my way through the menu to allow port 9000,
>> so I can read the stats page, and to find the public IP associated with
>> my "app".
>> That said, I still can't get a shell on the running container, but I
>> think I found an AWS documentation page for this purpose.
>>
>> I'll keep you updated.
>>
>> On Tue, Jul 3, 2018 at 1:06 PM, Baptiste <[email protected]> wrote:
>>
>>> Hi Jim,
>>>
>>> I think I have something running...
>>> At least, terraform did not complain and I can see "stuff" in my AWS
>>> dashboard.
>>> Now, I have no idea how I can get connected to my running HAProxy
>>> container, nor how I can troubleshoot what's happening :)
>>>
>>> Any help would be (again) appreciated.
>>>
>>> Baptiste
>>>
>>>
>>>
>>> On Tue, Jul 3, 2018 at 11:39 AM, Baptiste <[email protected]> wrote:
>>>
>>>> Hi Jim,
>>>>
>>>> Sorry for the long pause :)
>>>> I was dealing with some travel, conferences and catching up on my
>>>> backlog.
>>>> So, the good news is that this issue is now my priority :)
>>>>
>>>> I'll try to first reproduce it and come back to you if I have any issue
>>>> during that step.
>>>> (by the way, thanks for the github repo to help me speed up in that
>>>> step).
>>>>
>>>> Baptiste
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 25, 2018 at 10:54 PM, Jim Deville <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi Baptiste,
>>>>>
>>>>>
>>>>> I just wanted to follow up to see if you were able to repro and
>>>>> perhaps had a patch we could try?
>>>>>
>>>>>
>>>>> Jim
>>>>> ------------------------------
>>>>> *From:* Jim Deville
>>>>> *Sent:* Thursday, June 21, 2018 1:05:49 PM
>>>>> *To:* Baptiste
>>>>> *Cc:* [email protected]; Jonathan Works
>>>>> *Subject:* Re: Issue with parsing DNS from AWS
>>>>>
>>>>>
>>>>> Thanks for the reply, we were able to extract a minimal repro to
>>>>> demonstrate the problem:
>>>>> https://github.com/jgworks/haproxy-servicediscovery
>>>>>
>>>>>
>>>>> The docker folder contains a version of the config we're using and a
>>>>> startup script to determine the local private DNS zone (AWS puts it at the
>>>>> subnet's +2).
>>>>>
>>>>>
>>>>> Jim
>>>>> ------------------------------
>>>>> *From:* Baptiste <[email protected]>
>>>>> *Sent:* Thursday, June 21, 2018 11:02:26 AM
>>>>> *To:* Jim Deville
>>>>> *Cc:* [email protected]; Jonathan Works
>>>>> *Subject:* Re: Issue with parsing DNS from AWS
>>>>>
>>>>> and by the way, I had a quick look at the pcap file and could not find
>>>>> anything weird.
>>>>> The function you're pointing to seems to say there is not enough space
>>>>> to store a server's DNS name, but the allocated space is larger than
>>>>> your current records.
>>>>>
>>>>> Baptiste
>>>>>
>>>>
>>>>
>>>
>>
>
