tonight's exchanges here related to "use-stale" seem discordant to me. i'd like to play the straight man for a moment and ask some indulgent person to bring me up to speed by way of correcting my impressions.

the DNS TTL field is a state management variable. in this case the held state is in the form of cached RRsets, and the TTL associated with the RRset describes the period of time during which they can be reused. by the original DNS specifications, after this reuse period, these RRsets are to be discarded, and if the data is still needed, it is re-fetched.

in practice, TTL expiry is often not discovered until the records are about to be reused; this avoids the cpu and memory bandwidth costs of sweeping the cache periodically in search of expiration-ready RRsets, and avoids the additional state requirements of threading these RRsets by TTL in addition to the standard cost of threading them by recency of use (to facilitate LRU-based purge when the cache reaches its limit). what this practice leads to is a "sudden concurrent need" for the RRset at the precise moment when it is being discarded.
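
to make the lazy-expiry behaviour concrete, here is a minimal sketch of a cache that only notices expiry when a lookup touches the entry. the class name, the method names and the injectable clock are all hypothetical, chosen for illustration; no real resolver is claimed to be structured this way.

```python
import time


class LazyCache:
    """toy RRset cache: expiry is only noticed when a lookup touches the entry."""

    def __init__(self, clock=time.monotonic):
        # clock is injectable so the behaviour can be exercised without sleeping
        self.clock = clock
        self.entries = {}  # (name, rrtype) -> (rdata, absolute expiry time)

    def put(self, name, rrtype, rdata, ttl):
        self.entries[(name, rrtype)] = (rdata, self.clock() + ttl)

    def get(self, name, rrtype):
        key = (name, rrtype)
        hit = self.entries.get(key)
        if hit is None:
            return None
        rdata, expires = hit
        if self.clock() >= expires:
            # expiry is discovered here, at the moment of need:
            # the entry is discarded and the caller must re-fetch
            del self.entries[key]
            return None
        return rdata
```

note that an expired entry stays in memory until somebody asks for it, which is exactly why the discard and the need coincide.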

in order to avoid simultaneous "not having" and "great need", some RDNS servers do in fact sweep their caches or perhaps thread their RRsets by TTL expiration, in order to pre-launch a refreshment query when the TTL still has some fraction (like 5%) or period (like one minute) remaining. this is non-ideal since we often find that we're refreshing data that will not be used soon or perhaps ever. work is underway by several teams to find a "tuning set" of variables and thresholds which will better predict reuse in order to avoid refresh costs for non-reuse.
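
the prefetch trigger described above could be sketched as follows. the function name and both default thresholds are assumptions taken from the examples in the text (5% and one minute); this version fires when either threshold is crossed, which is only one of several plausible policies.

```python
def should_prefetch(ttl, remaining, fraction=0.05, floor_seconds=60):
    """return True when the remaining lifetime has dropped under either threshold.

    ttl           -- the RRset's original TTL in seconds
    remaining     -- seconds left before TTL expiry
    fraction      -- refresh when this fraction of the TTL is left (assumed 5%)
    floor_seconds -- or when this many seconds are left (assumed one minute)
    """
    if remaining <= 0:
        # already expired: this is an ordinary re-fetch, not a prefetch
        return False
    return remaining <= ttl * fraction or remaining <= floor_seconds
```

one visible weakness of this naive rule is that any RRset whose TTL is at or below the floor is always eligible for prefetch, which is the kind of refresh-without-reuse cost the tuning work is trying to avoid.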

another deployed method of avoiding simultaneous "don't have" with "great need" is to liberally reinterpret TTL such that RRsets can be reused beyond their explicit TTL lifetime, while their refresh queries proceed in the background. commonly, the authority servers responsible for answering these refresh queries are down or unreachable at the time of most acute need. hence the term "serve stale", which indicates a state management method whereby stale (beyond its TTL) data is served for some period of time, measured in minutes or hours, until the authority server can be reached to either refresh the RRsets or verify that they have in fact disappeared.
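
a rough sketch of the serve-stale decision follows. the function name, the entry layout, the use of OSError to mean "authority unreachable", and the one-hour max_stale default are all assumptions for illustration. the refresh is done inline here for brevity; as described above, a real resolver would answer from the stale data and let the refresh proceed in the background.

```python
def answer(entry, now, fetch, max_stale=3600):
    """decide what to return for a cached RRset.

    entry     -- dict with 'rdata' and 'expires' (absolute time, seconds)
    now       -- current time in seconds
    fetch     -- callable returning (rdata, ttl); raises OSError when the
                 authority server cannot be reached (an assumed convention)
    max_stale -- how long past TTL the data may still be served (assumed)
    """
    if now < entry["expires"]:
        return entry["rdata"], "fresh"
    try:
        rdata, ttl = fetch()
    except OSError:
        if now < entry["expires"] + max_stale:
            # authority unreachable: stretch the TTL and serve what we have
            return entry["rdata"], "stale"
        return None, "expired"
    # authority reachable: the refreshed data replaces the old RRset
    entry["rdata"], entry["expires"] = rdata, now + ttl
    return rdata, "refreshed"
```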

the danger of TTL stretching is that reuse beyond TTL may cause RRsets that are in fact supposed to be unreachable, to be effectively reachable. examples include security-related takedown of criminal DNS servers or networks, or failover strategies where end systems will not try to reach their backup servers unless they cannot reach their primary servers, and the unreachability of those primary servers is hidden from them by TTL stretching. fundamentally, an RRset and its TTL are the property of the zone administrator, and it's controversial for any other party to use this data beyond its specified use parameters.

all of this trouble comes from DNS's use of a single state variable (TTL) to represent usability lifetime, rather than two such variables, one indicating the periodicity of refresh, the other indicating the periodicity of discard. many of us would like our data to be rechecked hourly by all caching servers who store it, but used for days or weeks if we become unreachable by some or all of those servers. using one variable for two purposes represents an inconvenient compromise which often provides "no right answer" as to setting. therefore an idealized solution would be to provide a second variable, and where that second variable is present, the meaning of the existing variable (TTL) could be subtly altered to support a two-variable setting.
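
the two-variable idea can be pictured as a three-way classification of a cached RRset by its age. the names rrset_state, refresh_interval and discard_interval are hypothetical; nothing like them exists in the DNS specifications today.

```python
def rrset_state(age, refresh_interval, discard_interval):
    """classify a cached RRset by age under a two-variable lifetime model.

    age              -- seconds since the RRset was fetched
    refresh_interval -- after this, try to re-check with the authority
    discard_interval -- after this, stop using the data altogether
    """
    if discard_interval < refresh_interval:
        raise ValueError("discard_interval must not be shorter than refresh_interval")
    if age < refresh_interval:
        return "use"
    if age < discard_interval:
        # still usable, but a refresh should be attempted
        return "use-and-refresh"
    return "discard"
```

with a refresh interval of one hour and a discard interval of one week, data half an hour old is simply used, data two hours old is used while a refresh is attempted, and data eight days old is dropped. with today's single TTL the two intervals are equal and the middle state never occurs.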

therefore a "serve stale" team within IETF-DNSOP was convened, to try to standardize the methods and signal patterns necessary to extend the usability lifetime of records when their authority servers are not reachable at the time of normal TTL-based expiry. most of us recognize that TTLs will continue to be stretched no matter what changes are or are not made to the specification, and so we expect the resulting RFC to document current practice _without recommending it_ and to also document a new practice _with recommendations_ as to its proper uses.

there are hangups in signaling options due to the sloppy specification for EDNS, about which the author of EDNS0 feels just awful, believe me. however, we are all relatively sure that EDNS can be used to encode a desire for new state management behaviour, within the limitation that EDNS must first be signaled by the initiator before it can be answered by a responder, and we might wish it otherwise. that's why it was important to realize that if _any_ EDNS option is provided by an initiator, then _any_ EDNS option can be provided by a responder. in theory this means we could provide state management options in a response without having heard any state management options in a request -- so long as some form of EDNS was in fact used in the request. it's not yet clear that this evasive maneuver will be required, however.

the most straightforward signaling would be for an RD=0 initiator (normally a recursive DNS server) to ask some or all of its responders (normally authority servers) for permission to stretch the TTL. some responders will not answer this signal at all, some will say no, and some will say yes and give maximum tension values for the RRsets contained in the answer and authority sections -- but not for the additional section since that data might have a different authority server and may only be present as "glue". the new tension variable might be "maximum stretch interval" in which case the RRset's TTL _in this answer or authority section_ would be interpreted as a refresh interval. this system would allow gradual insertion of the new state management logic on an opportunistic basis. motivated authority and recursive server operators, which would include CDN operators who must perform both services perfectly, would be early adopters, and as with ECS before it, the "hot" part of the community would be upgraded years earlier than the last outlier.
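
here is one way the initiator might interpret such a response, sketched under loud assumptions: no wire format exists for this signal, the names Lifetimes and lifetimes are hypothetical, and treating "maximum stretch interval" as an amount of time added on top of the TTL is my guess at one possible meaning, not a settled definition.

```python
from dataclasses import dataclass


@dataclass
class Lifetimes:
    refresh: int  # seconds until a refresh should be attempted
    discard: int  # seconds until the RRset must no longer be used


def lifetimes(section, ttl, max_stretch=None):
    """derive refresh and discard lifetimes for one RRset in a response.

    section     -- "answer", "authority" or "additional"
    ttl         -- the RRset's TTL as received
    max_stretch -- responder-granted stretch in seconds, or None if the
                   responder said no or did not answer the signal (assumed)
    """
    if max_stretch is None or section == "additional":
        # no permission, or possibly glue from another authority:
        # classic single-variable behaviour, TTL is both refresh and discard
        return Lifetimes(refresh=ttl, discard=ttl)
    # permission granted: TTL becomes the refresh interval (assumed additive stretch)
    return Lifetimes(refresh=ttl, discard=ttl + max_stretch)
```

a responder that never answers the signal leaves the initiator in today's behaviour, which is what makes the opportunistic rollout possible.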

no one has proposed any new signaling between the stub and the recursive, but it's possible that a stub may want a true TTL and so we might add signaling from the stub (as initiator) saying, don't stretch, or perhaps saying, if this is a stretched TTL, tell me so explicitly.

if this understanding isn't wrong or incomplete, then i fail to see why there would be any drama that would prevent the construction of a draft.

P Vixie

