On Tue, Apr 3, 2018 at 4:17 PM, Dale Smith <dal...@gmail.com> wrote:

> Hello All,
>
> I have noticed a potential issue in HAProxy 1.7 where a backend server has
> reported in logs an 'unspecified DNS error' and been marked down for
> maintenance.
>
> After some investigation, I have found that our internal DNS server is
> responding with non-lowercase domain names in the Answer section. When
> HAProxy 'check' performs the lookup in dns.c the validation on the Answer
> domain with memcmp fails.
>
> I'm trying to understand what system is at fault here; the DNS server for
> not responding with the same case as the query, or HAProxy which should be
> performing a case insensitive match.
> I have made some test changes to src/dns.c to implement case insensitive
> matching in line 260 (in dns_resolve_recv) and unifying case before
> returning in dns_read_name. After these changes I can no longer reproduce
> the problem. I'm happy to share my patch file, but expect that your own
> libs and implementation will be cleaner.
> Relevant RFC: https://tools.ietf.org/html/rfc4343#section-3
>
> The current viable workaround appears to be simply matching the case that
> the DNS server is configured to reply with. For us this means specifying
> something like 'Server.db.example.com' instead of 'server.db.example.com'
> in haproxy.cfg.
>
>
> Reproduction steps:
> * Use 'check' and 'resolvers' in server section to enable
> post-initialisation dns lookups.
> * Have the DNS resolver reply in upper/lower/mixed case that does not
> match config file. (I can share a small python script that can be used to
> test this)
> The initial resolution during config read is fine, as this is done via
> libc. The subsequent DNS resolutions during checks fail and eventually set
> the server as offline.
>
> ---
> Sample configuration to reproduce:
> defaults
>     log                     global
>     mode                    tcp
>     timeout connect         10s
>     timeout client          1m
>     timeout server          1m
>     timeout check           10s
>
> resolvers dns-resolver
>     nameserver ns1 127.0.0.1:1153
>     resolve_retries      3
>     timeout retry        1s
>     hold valid           2s # small, so we can reproduce quickly
>     hold other           10s
>
> frontend fe15000
>     bind :15000
>     default_backend be15000
>
> backend be15000
>     server dest15000 example.com:5000 check resolvers dns-resolver
> init-addr libc,none
> ---
>
> Sample DNS response that triggers the issue:
> (Note the reply CaSe does not match query. I can also provide a simple
> Python server that performs uppercase in its reply, for replication of
> this.)
> ---
> $ dig @127.0.0.1 -p 1153 example.com
> <snip>
> ;; QUESTION SECTION:
> ;EXAMPLE.COM. IN A
>
> ;; ANSWER SECTION:
> EXAMPLE.COM. 60 IN A 127.0.0.1
> <snip>
>
>
> Thanks for any assistance,
> Dale Smith
>
>

Hi Dale,

Thanks for the report!
Please share your patch here and I'll have a look, so we could merge it.

Baptiste

Reply via email to