Hi
I have a problem with my BIND server, and I think this may be a bug. Maybe some
people here can help me verify that?
Short version: When a nameserver has A and AAAA records with different TTLs, a
BIND instance running IPv4-only (-4) fails to resolve names served by it once
the A record expires and only the AAAA is left in the cache.
Long version:
I have a RHEL 9.7 server running the official ISC bind docker image (v9.20.18,
but also tested with v9.21.17) with the docker version 3:29.1.2-1.el9. My
network does not have IPv6, sadly. This instance is my main recursive resolver.
My docker-compose.yml is
services:
  named-prod:
    container_name: named-prod
    hostname: named-prod
    image: internetsystemsconsortium/bind9:9.20
    # Overruling the entrypoint to start the daemon with the "-4" option, to
    # disable IPv6 communication
    entrypoint: /usr/sbin/named -f -c /etc/bind/named.conf -u bind -4
    # Directly attaching the container to the OS in order to see the real
    # source IP in logs and ACLs
    network_mode: host
    volumes:
      - etc-bind:/etc/bind
      - cache:/var/cache/bind
      - lib:/var/lib/bind
      - log:/var/log
    restart: unless-stopped
volumes:
  etc-bind:
  cache:
  lib:
  log:
The named.conf is:
acl rec-queries {
    10.0.0.0/8;
    192.168.0.0/16;
    172.16.0.0/12;
    127.0.0.0/8;
    ::1;
    FE80::;
};
options {
    directory "/var/cache/bind";
    allow-transfer { };
    allow-recursion { rec-queries; };
    notify no;
    hostname "unknown";
    version "unknown";
    listen-on { any; };
};
controls {
    inet 127.0.0.1 allow { localhost; } keys { rndc-key; };
};
logging {
    channel default_syslog {
        stderr;            # log to stderr so it's in the docker logs
        print-time yes;
        severity dynamic;  # log at the server's current debug level
    };
    category default { default_syslog; };
};
include "/etc/bind/named.conf.hint_rfc1912";
When I resolve www.semigator.de for the first time, the server resolves it
successfully, and this is the data in the cache (semigator is hosted on the
haufegroup nameservers):
# rm /var/lib/docker/volumes/named-prod_cache/_data/named_dump.db
# docker exec -ti named-prod /usr/sbin/rndc dumpdb -cache
# grep -iE \
  "semigator|78.138.66.90|haufegroup|192.174.68.103|2001:67c:1bc::103|176.97.158.103|2001:67c:10b8::103" \
  /var/lib/docker/volumes/named-prod_cache/_data/named_dump.db
haufegroup.com. 172800 NS ns1.haufegroup.de.
172800 NS ns2.haufegroup.com.
ns2.haufegroup.com. 3600 A 176.97.158.103
20260205000000 20260115000000 36627
haufegroup.com.
172800 AAAA 2001:67c:10b8::103
haufegroup.de. 86400 NS ns1.haufegroup.de.
86400 NS ns2.haufegroup.com.
ns1.haufegroup.de. 3600 A 192.174.68.103
20260205000000 20260115000000 20306
haufegroup.de.
86400 AAAA 2001:67c:1bc::103
semigator.de. 86400 NS ns1.haufegroup.de.
86400 NS ns2.haufegroup.com.
www.semigator.de. 60 A 78.138.66.90
; ns1.haufegroup.de. [v4 TTL 60] [v4 success] [v6 unexpected]
; 192.174.68.103 [srtt 2322] [flags 00000004] [edns 2/0] [plain 0/0]
[udpsize 512] [ttl 60]
; ns2.haufegroup.com. [v4 TTL 60] [v4 success] [v6 unexpected]
; 176.97.158.103 [srtt 907] [flags 00000004] [edns 1/0] [plain 0/0]
[udpsize 512] [ttl 60]
; ns1.haufegroup.de. [v4 TTL 3600] [v4 success] [v6 unexpected]
; 192.174.68.103 [srtt 2322] [flags 00000004] [edns 2/0] [plain 0/0]
[udpsize 512] [ttl 60]
; ns2.haufegroup.com. [v4 TTL 3600] [v4 success] [v6 unexpected]
; 176.97.158.103 [srtt 907] [flags 00000004] [edns 1/0] [plain 0/0]
[udpsize 512] [ttl 60]
As you can see, the A records of ns1 and ns2 have a TTL of 3600s, while the
AAAA records have 86400s (ns1) and 172800s (ns2). So once the A records
expire, only the AAAA records are left.
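To make that comparison less error-prone, here is a minimal Python sketch that reads an excerpt of the dump and reports, per nameserver, how long the AAAA record outlives the A record. The excerpt is copied from the dump above; parse_dump() and v4_gaps() are hypothetical helper names, and the parser only understands the few line shapes shown here, not the full dump format.

```python
# Sketch: find cached nameservers whose A record expires before the AAAA.
# DUMP is an excerpt of the first cache dump; helper names are illustrative.

DUMP = """\
ns2.haufegroup.com. 3600 A 176.97.158.103
20260205000000 20260115000000 36627
haufegroup.com.
172800 AAAA 2001:67c:10b8::103
ns1.haufegroup.de. 3600 A 192.174.68.103
20260205000000 20260115000000 20306
haufegroup.de.
86400 AAAA 2001:67c:1bc::103
"""

ADDRESS_TYPES = {"A", "AAAA"}


def parse_dump(text):
    """Return {owner: {rrtype: ttl}} for A/AAAA lines.

    Lines that start with a TTL instead of an owner name belong to the
    previous owner. Anything else (comments, signature residue) is skipped.
    """
    records = {}
    owner = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith(";"):
            continue
        if fields[0].isdigit():
            # Continuation line: "<ttl> <type> <rdata>"
            if len(fields) < 2 or owner is None:
                continue
            ttl, rrtype = int(fields[0]), fields[1]
        else:
            # Full line: "<owner> <ttl> <type> <rdata>"
            if len(fields) < 3 or not fields[1].isdigit():
                continue
            owner, ttl, rrtype = fields[0], int(fields[1]), fields[2]
        if rrtype in ADDRESS_TYPES:
            records.setdefault(owner, {})[rrtype] = ttl
    return records


def v4_gaps(records):
    """Seconds during which only the AAAA record is still cached."""
    return {
        owner: ttls["AAAA"] - ttls["A"]
        for owner, ttls in records.items()
        if "A" in ttls and "AAAA" in ttls and ttls["AAAA"] > ttls["A"]
    }


if __name__ == "__main__":
    for owner, gap in v4_gaps(parse_dump(DUMP)).items():
        print(f"{owner} has only an AAAA record cached for {gap}s")
```

For the values above this gives a window of 169200s for ns2.haufegroup.com and 82800s for ns1.haufegroup.de.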
An hour later, when I try to resolve semigator, it fails:
# dig @localhost www.semigator.de
; <<>> DiG 9.16.23-RH <<>> @localhost www.semigator.de
; (2 servers found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 26831
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 2
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 22fc2430e745765c01000000697774bd84bb9b8ec162e0ed (good)
; EDE: 22 (No Reachable Authority)
;; QUESTION SECTION:
;www.semigator.de. IN A
;; Query time: 2 msec
;; SERVER: ::1#3053(::1)
;; WHEN: Mon Jan 26 15:05:49 CET 2026
;; MSG SIZE rcvd: 201
In the cache, there are no A records left, just the AAAA records:
# rm /var/lib/docker/volumes/named-prod_cache/_data/named_dump.db
# docker exec -ti named-prod /usr/sbin/rndc dumpdb -cache
# grep -iE \
  "semigator|78.138.66.90|haufegroup|192.174.68.103|2001:67c:1bc::103|176.97.158.103|2001:67c:10b8::103" \
  /var/lib/docker/volumes/named-prod_cache/_data/named_dump.db
haufegroup.com. 168049 NS ns1.haufegroup.de.
168049 NS ns2.haufegroup.com.
ns2.haufegroup.com. 168049 AAAA 2001:67c:10b8::103
haufegroup.de. 81649 NS ns1.haufegroup.de.
81649 NS ns2.haufegroup.com.
ns1.haufegroup.de. 81649 AAAA 2001:67c:1bc::103
semigator.de. 81649 NS ns1.haufegroup.de.
81649 NS ns2.haufegroup.com.
; ns1.haufegroup.de. [v4 TTL 9] [v4 failure] [v6 unexpected]
; ns2.haufegroup.com. [v4 TTL 9] [v4 failure] [v6 unexpected]
; www.semigator.de/A [ttl 0]
The logfile only gives a cryptic message:
26-Jan-2026 14:05:49.836 shut down hung fetch while resolving 0x7ff7888da400(www.semigator.de/A)
26-Jan-2026 14:05:49.836 shut down hung fetch while resolving 0x7ff7888da400(ns1.haufegroup.de/A)
26-Jan-2026 14:05:49.836 shut down hung fetch while resolving 0x7ff7888da400(ns2.haufegroup.com/A)
In the v9.21.17 version, the message is a bit better:
26-Jan-2026 14:06:56.132 gave up on resolving 'www.semigator.de/A'
26-Jan-2026 14:06:56.132 gave up on resolving 'ns1.haufegroup.de/A'
26-Jan-2026 14:06:56.132 gave up on resolving 'ns2.haufegroup.com/A'
Looking at all the TTLs, the gTLD servers hand out glue records for
haufegroup.com with the long TTL:
# dig @h.gtld-servers.net haufegroup.com ns
haufegroup.com. 172800 IN NS ns2.haufegroup.com.
haufegroup.com. 172800 IN NS ns1.haufegroup.de.
ns2.haufegroup.com. 172800 IN A 176.97.158.103
ns2.haufegroup.com. 172800 IN AAAA 2001:67c:10b8::103
If you ask the authoritative server itself, you get the short TTLs:
# dig @ns2.haufegroup.com haufegroup.com ns
haufegroup.com. 3600 IN NS ns1.haufegroup.de.
haufegroup.com. 3600 IN NS ns2.haufegroup.com.
ns2.haufegroup.com. 3600 IN AAAA 2001:67c:10b8::103
ns2.haufegroup.com. 3600 IN A 176.97.158.103
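The two answers can also be compared mechanically. The following is a minimal Python sketch that parses both dig outputs (pasted as text, copied from above) and lists every record that appears in both with a different TTL. parse_rrs() and ttl_mismatches() are hypothetical helper names, and the parser assumes the plain "owner ttl IN type rdata" layout shown here.

```python
# Sketch: compare TTLs of identical records in two dig answers.
# PARENT is the answer from h.gtld-servers.net, CHILD the one from
# ns2.haufegroup.com, both copied from above. Helper names are illustrative.

PARENT = """\
haufegroup.com. 172800 IN NS ns2.haufegroup.com.
haufegroup.com. 172800 IN NS ns1.haufegroup.de.
ns2.haufegroup.com. 172800 IN A 176.97.158.103
ns2.haufegroup.com. 172800 IN AAAA 2001:67c:10b8::103
"""

CHILD = """\
haufegroup.com. 3600 IN NS ns1.haufegroup.de.
haufegroup.com. 3600 IN NS ns2.haufegroup.com.
ns2.haufegroup.com. 3600 IN AAAA 2001:67c:10b8::103
ns2.haufegroup.com. 3600 IN A 176.97.158.103
"""


def parse_rrs(text):
    """Map (owner, type, rdata) -> ttl for lines 'owner ttl IN type rdata'."""
    rrs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[1].isdigit() and fields[2] == "IN":
            key = (fields[0].lower(), fields[3], " ".join(fields[4:]))
            rrs[key] = int(fields[1])
    return rrs


def ttl_mismatches(parent_text, child_text):
    """Return sorted (key, parent_ttl, child_ttl) for records with differing TTLs."""
    parent, child = parse_rrs(parent_text), parse_rrs(child_text)
    shared = parent.keys() & child.keys()
    return sorted(
        (key, parent[key], child[key])
        for key in shared
        if parent[key] != child[key]
    )


if __name__ == "__main__":
    for (owner, rrtype, rdata), p_ttl, c_ttl in ttl_mismatches(PARENT, CHILD):
        print(f"{owner} {rrtype} {rdata}: parent {p_ttl}s, child {c_ttl}s")
```

With the data above, all four shared records differ: 172800s at the parent versus 3600s at the authoritative server.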
So my working theory is that, for some reason, BIND caches the shorter TTL for
the A record and the longer one for the AAAA record. Once the A record expires,
it tries to reach the nameserver via its AAAA address, but it can't, as it has
no IPv6 connectivity. And so it simply fails.
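To illustrate the theory only, here is a toy Python model of the cache state over time. It is not how named selects servers internally; it merely shows which cached nameserver addresses would still be alive for an IPv4-only resolver, assuming nothing is re-fetched. CACHE uses the TTLs from the first cache dump, and live_addresses() is a hypothetical helper.

```python
# Toy model of the theory; NOT the server-selection logic of named.
# Each entry: (owner, rrtype, ttl_when_cached, rdata), from the first dump.
CACHE = [
    ("ns1.haufegroup.de.", "A", 3600, "192.174.68.103"),
    ("ns1.haufegroup.de.", "AAAA", 86400, "2001:67c:1bc::103"),
    ("ns2.haufegroup.com.", "A", 3600, "176.97.158.103"),
    ("ns2.haufegroup.com.", "AAAA", 172800, "2001:67c:10b8::103"),
]


def live_addresses(cache, elapsed, ipv4_only):
    """Addresses still cached after `elapsed` seconds that the resolver may use."""
    wanted = {"A"} if ipv4_only else {"A", "AAAA"}
    return [
        rdata
        for (_owner, rrtype, ttl, rdata) in cache
        if ttl > elapsed and rrtype in wanted
    ]


if __name__ == "__main__":
    for elapsed in (0, 3599, 3600):
        print(elapsed, live_addresses(CACHE, elapsed, ipv4_only=True))
```

In this model an IPv4-only resolver has two usable addresses until the A records expire and none afterwards, while both AAAA records are still cached. Whether named should notice this and re-query for the A records is exactly the open question.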
Regards,
Christian