Hello All, I have noticed a potential issue in HAProxy 1.7 where a backend server has reported in logs an 'unspecified DNS error' and been marked down for maintenance.
After some investigation, I have found that our internal DNS server is responding with non-lowercase domain names in the Answer section. When HAProxy 'check' performs the lookup in dns.c the validation on the Answer domain with memcmp fails. I'm trying to understand what system is at fault here; the DNS server for not responding with the same case as the query, or HAProxy which should be performing a case insensitive match. I have made some test changes to src/dns.c to implement case insensitive matching in line 260 (in dns_resolve_recv) and unifying case before returning in dns_read_name. After these changes I can no longer reproduce the problem. I'm happy to share my patch file, but expect that your own libs and implementation will be cleaner. Relevant RFC: https://tools.ietf.org/html/rfc4343#section-3 The current viable workaround appears to be simply matching the case that the DNS server is configured to reply with. For us this means specifying something like 'Server.db.example.com' instead of 'server.db.example.com' in haproxy.cfg. Reproduction steps: * Use 'check' and 'resolvers' in server section to enable post-initialisation dns lookups. * Have the DNS resolver reply in upper/lower/mixed case that does not match config file. (I can share a small python script that can be used to test this) The initial resolution during config read is fine, as this is done via libc. The subsequent DNS resolutions during checks fail and eventually set the server as offline. --- Sample configuration to reproduce: defaults log global mode tcp timeout connect 10s timeout client 1m timeout server 1m timeout check 10s resolvers dns-resolver nameserver ns1 127.0.0.1:1153 resolve_retries 3 timeout retry 1s hold valid 2s # small, so we can reproduce quickly hold other 10s frontend fe15000 bind :15000 default_backend be15000 backend be15000 server dest15000 example.com:5000 check resolvers dns-resolver init-addr libc,none --- Sample DNS response that triggers the issue: (Note the reply CaSe does not match query. I can also provide a simple Python server that performs uppercase in its reply, for replication of this.) --- $ dig @127.0.0.1 -p 1153 example.com <snip> ;; QUESTION SECTION: ;EXAMPLE.COM. IN A ;; ANSWER SECTION: EXAMPLE.COM. 60 IN A 127.0.0.1 <snip> Thanks for any assistance, Dale Smith