Colleagues,

In Geoff Huston's recent ISP Column "Roll Over and Die?" 
(http://www.potaroo.net/ispcol/2010-02/rollover.pdf), Roy Arends made a 
thorough analysis of the behavior of Unbound in the face of increased traffic 
towards authoritative servers after a failed key-rollover.

Key to Roy's analysis is the observation that Unbound holds back after finding 
a bogus DNSKEY, but does so on a per-query rather than a per-zone basis.

> The default value of 60 seconds causes UNBOUND to restrain itself. However, 
> since its a per-message cache, it only restrains itself for that 
> qname/qclass/qtype tuple. Hence, if a different query is asked, UNBOUND needs 
> to validate the response, sees a bogus DNSKEY in the cache and starts to 
> re-fetch the dnskey keyset. In other words, a lame root key will cause DNSKEY 
> queries for every unique query seen per 60 second window.


We will address this with a caching mechanism that treats DNSSEC validation 
failures on a zone-wide basis instead of treating them as intermittent RR-set 
failures. That should significantly reduce the traffic to authoritative 
servers.

The reason why this particular problem is interesting is that, as developers, 
we are constantly trying to make the trade-off between the ability to recover 
from failure and the costs that those recovery mechanisms impose on third 
parties. Failure to validate a signature can have many causes, varying from 
misconfiguration or synchronization failure at the authoritative side, to 
on-path failure or attack, to misconfiguration at the receiving side. In this 
case we have not been conservative enough when making the trade-offs. 

The fact that these sorts of issues are being identified is a healthy sign of 
what is still early deployment, and we are eager to learn from these 
experiences. We use two resources for gathering experience that can help us 
make implementation choices: the IETF DNSOP working group and OARC 
(https://www.dns-oarc.net/). OARC is an organization where data is collected 
and shared so that the impact of certain implementation behavior can be 
quantified. We would like to ask people to contribute measurement data and 
share experiences. 

Back to the particular issue of stale keys. The column points out that there 
are mechanisms to prevent stale keys being retained after a key rollover: the 
mechanism described in RFC5011. As of version 1.4.0 Unbound has native support 
for maintaining the trust anchor across key rollovers based on RFC5011. We have 
also made "autotrust" <link> available for cases where trust anchors need to be 
maintained and Unbound is not used.

In the particular case described in the column, the RFC5011 methodology might 
not have worked: an old OS distribution carrying a stale key that is several 
generations old cannot be tracked using RFC5011 techniques. Wijngaards and 
Kolkman have been working on a proposal to fix that particular issue: "DNSSEC 
Trust Anchor History Service" 
(http://tools.ietf.org/html/draft-wijngaards-dnsop-trust-history).
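
Why a key that is several generations old cannot catch up can be shown with a 
small Python sketch. This is a deliberately simplified model, not RFC5011 
itself: hold-down timers and key revocation are left out, and the key names 
K1..K3 and the rollover history are invented for the example. The only rule 
kept is that a new key is accepted when the DNSKEY set announcing it is signed 
by a key that is already trusted.

```python
def track(trust_anchors, observed_keysets):
    """Follow rollovers: trust newly published keys only if the DNSKEY set
    that announces them is signed by a key we already trust.

    observed_keysets -- list of (published_keys, signing_keys) in time order
    Simplified model: no hold-down timers, no revocation.
    """
    anchors = set(trust_anchors)
    for published, signers in observed_keysets:
        if anchors & signers:
            anchors |= published  # validated, so new keys can be picked up
        # otherwise the set is bogus for us and nothing can be learned
    return anchors


# Hypothetical zone history: two rollovers, K1 -> K2 -> K3.
history = [
    ({"K1"}, {"K1"}),
    ({"K1", "K2"}, {"K1"}),  # K2 introduced, vouched for by K1
    ({"K2"}, {"K2"}),
    ({"K2", "K3"}, {"K2"}),  # K3 introduced, vouched for by K2
    ({"K3"}, {"K3"}),
]
current_signers = history[-1][1]

# A resolver that was running throughout sees every step.
always_on = track({"K1"}, history)

# An old distribution ships K1 and is first started after both rollovers,
# so it only ever sees the latest DNSKEY set.
stale = track({"K1"}, history[-1:])

print(sorted(always_on & current_signers))  # keys usable for validation today
print(sorted(stale & current_signers))
```

The resolver that observed every step ends up trusting K3. The one that starts 
from K1 after the fact has no trusted key in common with the current signers, 
which is the gap the trust history proposal is meant to close.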


-- Olaf Kolkman
   NLnet Labs


________________________________________________________ 

Olaf M. Kolkman                        NLnet Labs
                                       Science Park 140, 
http://www.nlnetlabs.nl/               1098 XG Amsterdam

