On Jul 10, 2024, at 1:03 PM, Philip Homburg <pch-dnso...@u-1.phicoh.com> wrote:
>> I see several different directions this could go that might be useful.
>>
>> 1. "DNS at the 99th percentile"
>>
>> Rather than normatively declare limits on things like NS count or
>> CNAME chain length, it would be interesting to measure behaviors out
>> in the real world. How long can your CNAME chain be before resolution
>> failure rates exceed 1%? How many NS records or RRSIGs can you add
>> before 99% of resolvers won't try them all?
>
> That has a bit of risk that we need a new document every year.

That's fine. Not every useful DNS-related document has to be an IETF RFC.

>> 2. "DNS Lower Limits"
>>
>> Similar to the current draft, but with a change of emphasis: instead
>> of setting upper bounds on the complexity of zones, focus on setting
>> lower bounds on the capability of resolvers.
>
> This is the same thing. If some popular resolvers implement the lower
> bound then it effectively becomes an upper bound on the complexity of
> zones.

That's a pretty big "if", especially when multiplied across all the
recommendations in the draft. Even then, it wouldn't apply to zones with
an unusual client base.

>> 3. "DNS Intrinsic Limits"
>>
>> Given the existing limits in the protocol (e.g. 64 KB responses, 255
>> octet names), document the extreme cases that might be challenging to
>> resolve. This could be used to create a live test suite, allowing
>> implementors to confirm that their resolvers scale to the worst-case
>> scenarios.
>
> Why? Do we really care if a resolver limits the size of RRsets to
> 32 KB?

Yes. Unnecessary limits restrict our flexibility even if mainstream use
cases don't exist today. Large RRsets have been considered in many
contexts over the years, most recently for post-quantum keys and
signatures.

> Tests can help to make sure that resolvers don't crash. But they may
> just return early when they see something ridiculous.

>> 4. "DNS Proof of Work"
>>
>> In most of these cases, the concern is that a hostile stub can easily
>> request resolution of a pathological domain, resulting in heavy load
>> on the recursive resolver. This is a problem of asymmetry: the stub
>> does much less work than the resolver in each transaction. We could
>> compensate for this by requiring the stub to increase the amount of
>> work that it does. For example, we could
>>
>> * Recommend that resolvers limit the amount of work they will do for
>>   UDP queries, returning TC=1 when the limit is reached.
>
> That immediately prompts the question of what the 'limit' is.

The limit is not standards-relevant. It could be "10 milliseconds of CPU
time" or "3 cache misses" or whatever. The stub doesn't need to know; it
just retries over TCP as already required.

> For example, a resolver could set TC=1 after encountering 2 CNAMEs.
> But I'm sure that will make a lot of people very unhappy.

A resolver can return TC=1 for all UDP queries if it wants, and this is
often discussed as a DoS defense mechanism. Returning TC=1 for 1% of
queries should not be a serious problem for anyone.

>> * Create a system where stubs pad their UDP query packets to prevent
>>   reflection-amplification attacks.
>
> That seems unrelated to this draft.

>> * Develop a novel proof-of-work extension, e.g. a continuation system
>>   that requires the stub to reissue heavy queries several times before
>>   getting the answer.
>
> That raises exactly the same question: what is 'heavy'?

Implementation-defined. There's no need to standardize it; the stub just
"continues" the query until it gets an answer or loses patience.
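To make the work-budget idea concrete, here is a minimal sketch of the
resolver side. Everything in it is illustrative: the "one work unit per
CNAME hop" accounting, the budget of 3, and the function names are all
invented for the example; nothing here comes from the draft or any RFC.
The only standard behavior it leans on is that a stub receiving TC=1
retries over TCP.

```python
# Illustrative sketch (not from the draft): a resolver that charges
# itself one "work unit" per CNAME hop and truncates UDP answers when
# an implementation-defined budget runs out. The budget never appears
# on the wire, so stubs need no knowledge of it.

TC_BIT = 0x0200  # truncation flag in the DNS header flags field


def resolve(name, cname_map, transport, budget=3):
    """Follow a CNAME chain; on UDP, give up with TC=1 once the budget
    is spent.

    cname_map maps owner names to CNAME targets; a name absent from the
    map is treated as the final answer. Returns (chain, header_flags).
    """
    flags = 0
    chain = [name]
    work = 0
    while chain[-1] in cname_map:
        work += 1
        if transport == "udp" and work > budget:
            # Out of budget: return a truncated partial answer so the
            # stub retries over TCP, where the budget does not apply.
            return chain, flags | TC_BIT
        chain.append(cname_map[chain[-1]])
    return chain, flags


# Example: a chain of four CNAME links exceeds a UDP budget of 3, so
# the stub sees TC=1; the same query over TCP resolves the full chain.
CHAIN = {"a.example": "b.example", "b.example": "c.example",
         "c.example": "d.example", "d.example": "e.example"}
_, udp_flags = resolve("a.example", CHAIN, transport="udp", budget=3)
tcp_chain, _ = resolve("a.example", CHAIN, transport="tcp", budget=3)
```

The point of the sketch is that the budget ("2 CNAMEs", "3 cache
misses", CPU time) is purely a local implementation choice; swapping in
a different accounting rule changes nothing visible to the stub except
which queries get truncated.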
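And here is a toy model of the "continuation" idea, again entirely
hypothetical: the class name, the cookie mechanics, and the choice of
three reissues are invented for illustration, since no such extension
is specified anywhere. It only shows the shape of the exchange: the
resolver defers a heavy answer until the stub has reissued the query
enough times, which rebalances the work asymmetry without the two
sides ever agreeing on what "heavy" means.

```python
# Toy model (hypothetical, no such DNS extension exists): the resolver
# hands back an opaque "continue" cookie for heavy queries, and only
# answers after the stub has reissued the query with that cookie a
# locally chosen number of times.

import secrets


class ContinuationResolver:
    def __init__(self, reissues_required=3):
        # How many reissues count as "enough work" is purely an
        # implementation choice, just like the TC=1 budget above.
        self.reissues_required = reissues_required
        self.progress = {}  # cookie -> number of issues seen so far

    def query(self, name, cookie=None):
        """Return ("answer", data) or ("continue", cookie)."""
        if cookie is None:
            cookie = secrets.token_hex(8)
            self.progress[cookie] = 0
        self.progress[cookie] += 1
        if self.progress[cookie] < self.reissues_required:
            return ("continue", cookie)
        del self.progress[cookie]
        return ("answer", f"final answer for {name}")


# Stub side: "continue" the query until it gets an answer or loses
# patience. The stub never learns why the query was considered heavy.
resolver = ContinuationResolver()
status, payload = resolver.query("heavy.example")
while status == "continue":
    status, payload = resolver.query("heavy.example", cookie=payload)
```

A real design would need the cookie to be stateless and unforgeable
(e.g. derived from a server secret, as DNS Cookies do), but the
stub-visible contract stays this simple.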
_______________________________________________
DNSOP mailing list -- dnsop@ietf.org
To unsubscribe send an email to dnsop-le...@ietf.org