Package: bind9
Version: 1:9.5.1.dfsg.P1-1
Severity: important

Since we upgraded our bind9 name servers from Etch to Lenny we have been
experiencing occasional hangs. While all requests for authoritative zones are
still answered correctly, we can't seem to get replies for recursive queries.
All we get is NXDOMAIN until we restart the bind process via its init.d
script. Roughly every tenth request is answered properly, but the next request
fails again with NXDOMAIN. So the successful response from the root servers
doesn't seem to get served from the internal cache either.
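
In case it helps to quantify the "every tenth request" behaviour, a probe
along the following lines could be run in a loop against the server. It is
only a minimal sketch using the Python standard library: the function names
are made up for illustration, and the server address and query name in the
usage note are placeholders, not values from our setup.

```python
import socket
import struct

# Only the response codes relevant to this report are named here.
RCODE_NAMES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}

def build_query(name, query_id=0x1234, qtype=1):
    """Build a DNS query (class IN) with the RD (recursion desired) bit set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def parse_rcode(response):
    """Return the RCODE, i.e. the low four bits of the header flags word."""
    flags = struct.unpack("!H", response[2:4])[0]
    return flags & 0x000F

def probe(server, name, timeout=3.0):
    """Send one recursive A query over UDP and return the response code name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    rcode = parse_rcode(response)
    return RCODE_NAMES.get(rcode, str(rcode))
```

Calling something like probe("192.0.2.53", "www.debian.org") once a second
and logging the result should show how many answers come back as NXDOMAIN
while the server is in the broken state.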

In that situation our log file fills up with:

Mar  4 07:21:45 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:21:56 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:22:07 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:22:57 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:22:58 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:22:59 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:23:03 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:23:09 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:23:32 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:23:36 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:23:37 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:23:43 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:26:23 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:26:34 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:31:35 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:32:33 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:32:35 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:32:45 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:47 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:51 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:54 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:54 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:56 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:56 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:58 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:58 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:33:58 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:34:00 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:34:03 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:34:12 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:34:12 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:34:13 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:34:13 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
Mar  4 07:34:13 pns named[7077]: general: checkhints: unable to get root NS rrset from cache: not found
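
To see how the rate of these messages develops over time, the syslog output
could be summarised per minute. The following is a rough sketch, assuming the
syslog line layout shown above; the function name is hypothetical, and the
code also tolerates entries that a mail client has wrapped after "root NS".

```python
import re
from collections import Counter

# Matches the checkhints entries quoted above and captures the timestamp
# up to the minute (month, day, HH:MM).
CHECKHINTS = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+named\[\d+\]:\s+"
    r"general: checkhints: unable to get root NS\s+rrset from cache"
)

def checkhints_per_minute(text):
    """Count checkhints failures per minute in a chunk of syslog text."""
    # Rejoin entries that were split across two lines after "root NS".
    joined = re.sub(r"root NS\s*\n\s*rrset", "root NS rrset", text)
    counts = Counter()
    for line in joined.splitlines():
        match = CHECKHINTS.match(line)
        if match:
            # Collapse the padded day field ("Mar  4") to single spaces.
            counts[" ".join(match.group("stamp").split())] += 1
    return counts
```

Feeding it the contents of the daemon log gives a mapping from minute to
number of failures, which makes bursts such as the one around 07:33 easier
to spot.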

We traced (tshark) what's happening on the network and it seems that bind9
isn't even sending out requests to the internet when we send it a recursive
query from inside/LAN. Instead it instantly replies with NXDOMAIN.

This happens every few days and requires a bind restart, or else our clients
can't run recursive queries any more (which apparently isn't making them
happy).

Our name server serves nearly 500 authoritative zones and is used as a
forwarder for the internal/LAN clients. "rndc status" shows:

==========================================
version: 9.5.1-P1
number of zones: 511
debug level: 0
xfers running: 0
xfers deferred: 0
soa queries in progress: 0
query logging is OFF
recursive clients: 6/0/1000
tcp clients: 0/100
server is up and running
==========================================
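
If it is useful to record these values periodically until the next hang, the
"rndc status" output can be turned into a dictionary with a few lines of
Python. This is only a sketch based on the output format shown above; the
function names are hypothetical, and the three numbers of the "recursive
clients" field are returned as-is without assigning a meaning to each.

```python
def parse_rndc_status(text):
    """Parse "key: value" lines of rndc status output into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (e.g. separators) are skipped
            status[key.strip()] = value.strip()
    return status

def recursive_clients(status):
    """Return the slash-separated "recursive clients" numbers as integers."""
    return tuple(int(part) for part in status["recursive clients"].split("/"))
```

The text passed in would be whatever "rndc status" prints, for example
captured from a cron job.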

Our named.conf* looks basically like this:

==========================================
options {
    directory "/var/cache/bind";
    max-cache-size 10m;
    datasize unlimited;
    stacksize unlimited;
    coresize default;
    auth-nxdomain no;
    check-names master ignore;
    transfers-per-ns 50;
    transfers-in 20;
    cleaning-interval 0;
    transfer-format one-answer;
    notify yes;
    allow-recursion { internal; };
    allow-query { internal; };
    allow-transfer { foo; bar; };
    also-notify { x.x.x.x; x.x.x.x; };
};

logging {
    category "lame-servers" { null; };
    channel default_syslog { syslog daemon; print-category yes; };
    category "default" { "default_syslog"; "default_debug"; };
};

key "rndc-key" {
    algorithm hmac-md5;
    secret "1D0NtG1V3Y0U0UrS3Cr3tK3Y==";
};

controls {
    inet 127.0.0.1 port 953 allow { 127.0.0.1; } keys { "rndc-key"; };
};
==========================================

The only other log message that was worrying us is:

Mar  4 07:18:03 pns named[7077]: general: max open files (1024) is smaller than max sockets (4096)
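
The numbers in that message can be compared against the file descriptor
limit of a process with Python's resource module. The sketch below is an
illustration only: the function names are made up, the default of 4096 is
taken from the log line above, and the check reflects the limit of the
process running the script, so it would have to be started in the same
environment as named to say anything about named itself.

```python
import resource

def shortfall(soft_limit, max_sockets):
    """Return how many descriptors are missing, or 0 if the limit suffices."""
    if soft_limit == resource.RLIM_INFINITY:
        return 0
    return max(0, max_sockets - soft_limit)

def current_shortfall(max_sockets=4096):
    """Compare this process's soft RLIMIT_NOFILE against max_sockets."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return shortfall(soft, max_sockets)
```

With the values from the log, shortfall(1024, 4096) gives 3072, i.e. the
open files limit would have to be raised by that much to cover all sockets.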

Let us know if there is anything else we can debug the next time this happens.

Cheers
 Christoph


