On 2/2/2010 5:25 PM, Rob Tanner wrote:
> Hi,
>
> We have two registered name servers to answer Internet queries. One is on site and the other is a service of our ISP. The problem is that every once in a while the secondary server doesn't successfully complete zone transfers and the data expires.
As the domain owner, you can set the expiration interval in the SOA record. Sounds like you set the interval way too short. Expiration intervals of a month or more are common. You need to give yourself enough time to detect that replication is broken, and fix it.
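For illustration, the expire value is the fourth timer in the SOA record; a zone file entry might look like this (example.com and all of the values here are just placeholders):

    example.com.  IN  SOA  ns1.example.com. hostmaster.example.com. (
                      2010020201  ; serial
                      3600        ; refresh (1 hour)
                      900         ; retry (15 minutes)
                      2592000     ; expire (30 days)
                      3600 )      ; negative-caching TTL (1 hour)

With a 30-day expire, the slave keeps answering for a month after its last successful refresh, which is plenty of time to notice and repair a broken transfer.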

As for detecting whether replication is broken, you could a) scan your logs to make sure that refresh queries and/or zone transfers are occurring, b) probe the secondary/secondaries to see whether their copy of the zone is up to date (compare the serial number in the SOA against the master's), or c) do both of the above; a sketch of option (b) follows.
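A quick sketch of option (b) using dig (ns1.example.com and ns2.example.com are placeholders for your master and the ISP's secondary):

    dig +short @ns1.example.com example.com SOA   # master's copy
    dig +short @ns2.example.com example.com SOA   # secondary's copy

The serial is the third field of each answer; if the secondary's serial lags the master's for much longer than a refresh interval, replication is broken.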

These measures may require that you force updates on a regular basis, which you can do by making some sort of "dummy" change, if no "real" changes have been made recently.
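With BIND, for example, incrementing the serial number in the SOA counts as such a change (the zone name is illustrative):

    # edit the zone file, increment the serial, then:
    rndc reload example.com

If NOTIFY is enabled (it is by default in BIND 9), the slaves will be told about the new serial right away instead of waiting for their next refresh.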

> I'm not sure, technically, how the server answers when queried for addresses it no longer thinks are valid, but even after it's fixed it takes a while for the bad data to go away.
If the zone truly *expires* then a nameserver will typically give a SERVFAIL response for any query of a name in the zone.
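You can check for this from the outside with dig; query the nameserver directly (names are placeholders) and look for "status: SERVFAIL" in the ;; ->>HEADER<<- line of the reply:

    dig @ns2.example.com www.example.com A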

Since SERVFAIL usually isn't cached very long, I'm assuming that by "bad data" you mean the staleness of the data leading up to the eventual expiration of the zone. Once replication connectivity has been restored, a refresh would be forced on the slave, and it would then have a current version of the zone (how you do that depends on which DNS implementation you're using; in extreme cases it might be necessary to delete a file, or to completely drop and re-add the slave-zone definition in the config).
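With BIND 9, for instance, you can usually avoid the file-deletion dance (the zone name is a placeholder):

    rndc refresh example.com      # queue an immediate SOA refresh check
    rndc retransfer example.com   # force a full transfer, ignoring serial numbers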

As for stale records cached in non-authoritative servers, you as the domain owner control the persistence of those through your TTL settings on the individual records, and the negative-TTL setting in the SOA record for the zone.
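A sketch of both knobs, with illustrative values:

    www.example.com.  300  IN  A  192.0.2.10  ; cached at most 5 minutes
    ; negative answers (NXDOMAIN) are capped by the last SOA field
    ; shown earlier, per RFC 2308

Lowering TTLs ahead of a planned change is the usual way to keep stale records from lingering in outside caches.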

> What I'm wondering is, what are the consequences of simply not using the secondary server? Right now we are looking at hardened appliances configured into a high-availability cluster, and I figure the pipe to the outside has a higher likelihood of going down than the cluster does. So, if name servers out on the Internet can't even reach our server because our connection is down, is that something that also propagates and gets cached (i.e., is no data treated the same as bad data by upstream BIND servers)?
Well, the Internet standards explicitly require at least two nameservers, and if the zone is delegated directly from a registry, the registry usually enforces that rule.

It would not be advisable to have only one delegated nameserver. Many apps distinguish between "host not found" and "cannot connect" errors. In the case of a store-and-forward subsystem like SMTP mail, for instance, the former is fatal while the latter will be retried for some period of time.

Also, more generally, putting all your eggs in one basket -- "if this fails then this other thing probably won't work either" -- rarely forms the basis of a responsible disaster recovery plan.

- Kevin
