Re: serve-expired: "yes" and cache-min-ttl: 30 unsafe?

Marc Branchaud via Unbound-users Tue, 13 Nov 2018 08:02:42 -0800

On 2018-10-30 1:50 a.m., Nick Urbanik wrote:

Dear Marc,


Thank you for your reply.

On 29/10/18 10:14 -0400, Marc Branchaud via Unbound-users wrote:

On 2018-10-28 3:20 p.m., Nick Urbanik via Unbound-users wrote:

On 25/10/18 18:10 +1100, Nick Urbanik via Unbound-users wrote:

I am puzzled by the behaviour of our multi-level DNS system which
answered many queries for names having shorter TTLs with SERVFAIL.


I mean that SERVFAILs went up to 50% of replies, and current names
with TTLs of around 300 failed to be fetched by the resolver, the last
DNS servers in the chain.  What I mean is that adding these two
configuration options (serve-expired: "yes" and cache-min-ttl: 30)
caused an outage.  I am trying to understand why.

Any ideas in understanding the mechanism would be very welcome.

We use 1.6.8 with both those settings, and observed prolonged SERVFAIL periods.

In our case, the upstream server became inaccessible for a period of time, but when contact resumed the SERVFAILs persisted.


This behaviour was quite catastrophic, and to me, unexpected.

Do you have any idea of the mechanism behind this failure?

Is there a way to deal better with zero TTL names?

We reduced the infra-host-ttl value to compensate.


(Sorry for my slow response -- this slipped through the cracks.)

Did that bring your system to a functioning condition?

Yes & no. We reduced infra-host-ttl to 30 seconds, which means that we are only affected by this for (up to) 30 seconds after upstream access returns. That is adequate for our purposes.
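
For reference, a minimal sketch of that change in unbound.conf. The forward-zone name and address here are placeholders for illustration only, not our actual configuration, and the comments reflect my reading of the unbound.conf man page:

```
server:
    # Lifetime, in seconds, of entries in the infrastructure host cache
    # (the documented default is 900).
    infra-host-ttl: 30

forward-zone:
    # Placeholder zone name and upstream address (192.0.2.1 is a
    # documentation-only address).
    name: "."
    forward-addr: 192.0.2.1
```

With a setting like this, an upstream that was marked unreachable should be retried within roughly 30 seconds rather than 900.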

So I think the mechanism is pretty clear, and I think it's good for unbound to cache the upstream server's status for a period of time. I'm just not convinced that 900 seconds is a reasonable default time.

(BTW, our case has nothing to do with zero TTL names: The IP address configured as the zone's forward-addr became inaccessible. No names involved. That said, I do not know how unbound deals with 0-TTL names.)

I do not think our case is a bug. It also has nothing to do with serve-expired or cache-min-ttl. But since we use those settings, I wanted to relate our experience with a confusing SERVFAIL situation.

In your multi-level system, are you 100% sure that all the forward-addr IPs are *always* accessible? If they are, then you may be seeing SERVFAILs for a different reason.

M.

(Why is infra-host-ttl's default 900 seconds? That seems like a long time to wait to retry the upstream server.)

M.

By multilevel, I mean clients talk to one server, which forwards to
another, and for some clients, there is a third level of caching.

So it was unwise to add:
serve-expired: "yes"
cache-min-ttl: 30

to the server section of these DNS servers running unbound 1.6.8 on
up to date RHEL 7?  Please could anyone cast some light on why this
was so?  I will be spending some time examining the cause.

If you need more information, please let me know.
