NXDOMAIN is not a "failure" response. Are you *sure* you're getting NXDOMAIN? 
If you're using nslookup to test, be aware that it does suffix searching by 
default, so if the original query, e.g. www.bbc.co.uk, fails, it will quietly 
(unless debug mode is in effect) start appending suffixes. Looking up those 
suffixed names, e.g. www.bbc.co.uk.example.com, most likely gets an NXDOMAIN, 
so nslookup reports NXDOMAIN as the overall result of the query. So it's 
basically a misreporting of the error by nslookup. 
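
To make the suffix-search behaviour concrete, here is a rough Python sketch of
how a stub resolver might build its list of candidate names. It is a simplified
model of resolv.conf-style search lists, not nslookup's actual code; the
function name candidate_names, the ndots parameter and the example.com suffix
are illustrative assumptions.

```python
def candidate_names(name, search_list, ndots=1):
    """Return the names a stub resolver would try, in order (simplified model)."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no suffix searching.
        return [name]
    absolute = name + "."
    suffixed = [f"{name}.{suffix.rstrip('.')}." for suffix in search_list]
    if name.count(".") >= ndots:
        # "Dotted enough": try the name as given first, then the suffixed forms.
        return [absolute] + suffixed
    # Otherwise try the suffixed forms first and the bare name last.
    return suffixed + [absolute]


# The suffixed name is only tried after the original query has failed.
print(candidate_names("www.bbc.co.uk", ["example.com"]))
# A trailing dot suppresses the search list entirely.
print(candidate_names("www.bbc.co.uk.", ["example.com"]))
```

In practice this means that querying with a trailing dot (www.bbc.co.uk.)
should show the result for the name you actually asked about.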

Note that only one of the records in your cache dump is actually relevant -- 
the CNAME from www.bbc.co.uk to www.bbc.net.uk -- and the others are for a 
different part of the namespace (thdow.bbc.co.uk).

If you do an explicit query of the CNAME, when the problem is occurring, does 
it resolve? I would expect, even though the cache entry is marked 
"pending-answer", it will still resolve. But, without the target of the CNAME 
also resolving, the lookup as a whole cannot succeed.
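
As a toy illustration of that last point (not how named is implemented), the
Python sketch below follows a CNAME through a small in-memory record table.
The table layout, the resolve helper and the 192.0.2.1 address are all
hypothetical placeholders; only the CNAME itself comes from your dump.

```python
def resolve(name, records, max_hops=8):
    """Follow CNAMEs in a toy record table; return a list of addresses or None."""
    for _ in range(max_hops):
        rtype, value = records.get(name, (None, None))
        if rtype == "A":
            return value          # reached address records: lookup succeeds
        if rtype != "CNAME":
            return None           # nothing usable for this name: lookup fails
        name = value              # follow the alias to its target
    return None                   # too many hops (possible CNAME loop)


# Only the CNAME is present, as in the cache dump: the lookup cannot succeed.
cache = {"www.bbc.co.uk.": ("CNAME", "www.bbc.net.uk.")}
print(resolve("www.bbc.co.uk.", cache))

# Once the target also resolves (placeholder address), the whole lookup works.
cache["www.bbc.net.uk."] = ("A", ["192.0.2.1"])
print(resolve("www.bbc.co.uk.", cache))
```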

                                                                                
                        - Kevin

-----Original Message-----
From: bind-users-boun...@lists.isc.org 
[mailto:bind-users-boun...@lists.isc.org] On Behalf Of 
tpcb...@mklab.ph.rhul.ac.uk
Sent: Tuesday, January 26, 2016 8:02 PM
To: bind-users@lists.isc.org
Subject: Name resolution failure on a caching server -- many '; pending-answer' 
records in the cache

Dear All,
     I run a caching server on a section of the departmental LAN.
Occasionally network congestion results in timeouts & name resolution failures. 
 Lookups performed on name servers outside my LAN section fail with NXDOMAIN.  
Querying my name server for items not in its cache gets the same result.

My problem is that long after the congestion has subsided, queries to my name 
server still result in NXDOMAIN failure.  AFAICT this situation remains 
indefinitely, until the cache is flushed ('rndc flush') or BIND is restarted.  
When it is in this state, dumping the cache with 'rndc dumpdb' shows numerous 
entries like this:

--------------------------------------------------------------------------------------------
; pending-additional
thdow.bbc.co.uk.        76632   NS      ns3.bbc.net.uk.
                        76632   NS      ns4.bbc.co.uk.
                        76632   NS      ns4.bbc.net.uk.
                        76632   NS      ns3.bbc.co.uk.
; pending-answer
ns0.thdow.bbc.co.uk.    2082    \-AAAA  ;-$NXRRSET
; thdow.bbc.co.uk. SOA ns.bbc.co.uk. hostmaster.bbc.co.uk. 2015122100 1800 600 864000 86400
; pending-answer
                        76632   A       212.58.240.162
; pending-answer
www.bbc.co.uk.          30      CNAME   www.bbc.net.uk.
; glue
--------------------------------------------------------------------------------------------

and attempts to look up e.g. www.bbc.co.uk result in NXDOMAIN.

Browsing the documentation I noticed the parameter 'max-ncache-ttl',
which is unset in my named.conf and apparently defaults to 3 hours.
However, the problem persists long after 3 hours have elapsed following 
incidents of network congestion.

I could set up a cron job to check name resolution on external domains and 
flush the cache when it fails.  I am assuming there must be a better solution!  
Should I set max-ncache-ttl to something fairly short in my named.conf, in 
case the default value is for some reason actually >> 3 hours?

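For reference, the cron-job workaround could look roughly like the Python
sketch below. It is only a sketch under assumptions: the names resolves and
check_and_flush are hypothetical, the probe hosts are up to you, and it assumes
the machine running it uses this caching server as its system resolver and is
allowed to run 'rndc flush'.

```python
import socket
import subprocess


def resolves(host):
    """Return True if the system resolver can look up host."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def check_and_flush(hosts, probe=resolves, flush=None):
    """Flush the cache only if every probe host fails to resolve.

    Returns True if a flush was triggered, False otherwise.
    """
    if flush is None:
        # Assumption: rndc is on PATH and configured for the local named.
        flush = lambda: subprocess.run(["rndc", "flush"], check=True)
    if any(probe(h) for h in hosts):
        return False  # at least one name resolves, leave the cache alone
    flush()
    return True
```

A cron entry would then call check_and_flush() with a few external names;
requiring all of them to fail avoids flushing because of a single broken zone.
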
BTW is there a way to dump out all the parameters from a running named
-- just to see all their values?


Any ideas on how to solve or further diagnose the problem?

Many thanks
Tom Crane

System details:
OS:    Scientific Linux CERN SLC release 6.7 (Carbon) [NB: SLC is a derivative 
of RHEL]
BIND:  bind-9.8.2-0.37.rc1.el6_7.5.x86_64

Ps. I originally posted in Usenet NG comp.protocols.dns.bind but got no 
followups and then noticed all messages in that NG had this ML's fields 
'NNTP-Posting-Host: lists.isc.org' and 'X-Original-To: 
bind-users@lists.isc.org' etc. in their headers.  Is c.p.d.b actually a 
moderated group now or exclusively tied to this ML via a mail2news gateway?

-- 
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England.
Email:  T dot Crane at rhul dot ac dot uk

_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe 
from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users