On Mon, Apr 20, 2020 at 1:27 PM Gavin McCullagh <gmccull...@gmail.com>
wrote:

> Hi,
>
> I'm new to posting on this list, so please accept my advance apologies if
> I make any novice errors or posted this in the wrong place.  Apologies also
> for the long email. :-)
>

Hi Gavin, and welcome to DNSOP! :) Thanks for your comments.

> I can't claim to have the same detailed knowledge of the protocol as the
> authors of this draft.  All the same, I've been mulling this
> child/parent-centric resolver question for a while and watching its impact
> on our customers and developers.  This draft seems to resolve this question
> with the conclusion that child-centric (non-sticky?) is the correct
> behaviour.
>
> Shumon Huque:
> > There is a range of different behaviors in resolver implementations
> > in this respect today, and it would be good if we could agree on
> > more commonality.
>
> I agree.  Having a predictable standard behavior (at least for recent,
> well-behaved resolvers) is very desirable.
>
> The recent thread on the DNS OARC list seemed to frame this question as a
> trade off mostly of these points:
>
> [a] https://tools.ietf.org/html/rfc2181#section-5.4.1 saying the in-zone
> NS is authoritative and more trustworthy.  (pro: child-centric)
> [b] flexibility for DNS operators to lower the effective TTL on a
> delegation during changes despite registries fixing their TTL.   (pro:
> child-centric)
>      Paragraph 4 in
> https://tools.ietf.org/html/draft-huque-dnsop-ns-revalidation-01#section-2
> [c] the additional complexity resolvers will have to bear  (pro:
> parent-centric)
> [d] making resolution deterministic  (pro: parent-centric)
>
> I'd like to draw attention to a fifth item which I haven't seen addressed.
>
> [e] obeying the principle of least astonishment for mortal DNS operators
> who do not understand this subtlety (and who I assume are the overwhelming
> majority).  (pro: parent-centric)
>
> The child-centric resolver behaviour applied by many resolvers today is
> probably clear to most who read this list and DNS OARC.   But it's very
> counter-intuitive to everyone else.  In my experience the overwhelming
> majority of people operating DNS do not understand this very subtle point.
> They all tend to assume the parent-centric behaviour.  Perhaps that's
> because many of them have a software developer background and are familiar
> with tree data structures and linked lists.  Put another way, child-centric
> resolvers effectively insist on there being two sources of truth for a
> delegation, which is very surprising.
>

A few years ago, when working on another RFC, we were actually having
trouble explaining to some others (in the IETF no less!) that the DNS
namespace and cache were a tree data structure. So I'm pleased to hear this
:)

But appreciating the subtleties of the DNS delegation mechanism involves a
lot of arcane details that are not easy to understand for anyone. If the
namespace is a tree, and zones are contiguous subtrees, how do you
visualize a zone cut in this data structure? It doesn't cleanly intersect
any node or edge between nodes, but rather partitions the data sets located
at a node in a weird, inconsistent fashion. Most apex RRsets live
authoritatively at the node, but one (NS) has a copy in the parent zone, and
one (DS) is authoritative in the parent. It's all very complicated.
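
To make that partitioning concrete, here is a minimal Python sketch. It is
my own simplified illustration, not taken from any resolver or RFC text: the
table and function names are hypothetical, and it deliberately ignores glue
and NSEC/NSEC3 records.

```python
# Simplified, hypothetical model of the RRsets found at a delegated name.
# Each entry maps an RR type to (authoritative side, present in parent zone?).
PARENT, CHILD = "parent", "child"

ZONE_CUT_RRSETS = {
    "SOA":    (CHILD, False),   # ordinary apex data, child only
    "DNSKEY": (CHILD, False),   # ordinary apex data, child only
    "MX":     (CHILD, False),   # ordinary apex data, child only
    "NS":     (CHILD, True),    # parent carries an unsigned copy
    "DS":     (PARENT, True),   # lives and is signed in the parent
}

def authoritative_side(rrtype: str) -> str:
    """Return which zone is authoritative for this RR type at the cut."""
    return ZONE_CUT_RRSETS[rrtype][0]

def present_in_parent(rrtype: str) -> bool:
    """Return True if the parent zone also serves this RRset at the cut."""
    return ZONE_CUT_RRSETS[rrtype][1]

for rrtype in ZONE_CUT_RRSETS:
    print(rrtype, authoritative_side(rrtype), present_in_parent(rrtype))
```

The point of the table is that no single rule ("the node belongs to the
child" or "the node belongs to the parent") describes every row.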

> [a] seems like a definition which could be changed if it was so decided.
> DS records are totally parent centric for example.  It seems like NS could
> be too if we declared the in-zone NS to be "informational only".  I
> understand the desire for [b], but it seems to propose to dictate a
> specific behavior at every level of delegation in order to work around a
> problem that only exists with second level delegations (those managed by
> registries).  In so doing, it optimizes for an arcane but admittedly useful
> flexibility that only a tiny minority of DNS operators will ever understand
> how to use.  At the same time, we'd be standardizing on behavior that
> surprises the majority of operators and at times causes them (or at least
> leads them into) outages, even when they are working at a 3rd or 4th level
> delegation (ie not a registry).
>

Changing (a) properly requires re-designing the DNS delegation mechanism,
and sounds like a much more ambitious undertaking to me. I am actually
interested in looking at that myself, because there are certainly
deficiencies that could be addressed. Since delegation records and glue
address records are unsigned, they can be spoofed, and DNSSEC should really
allow us to detect such spoofing once a resolver sees referral data, and
not (as is the case now) only after being sent on a wild goose chase to a
set of rogue servers and subsequently determining that those servers cannot
present a DNSKEY RRset that matches the parent DS. Fixing this requires
creating a new delegation record type that can be signed in the parent. The
TTL issue and delegation revalidation would then have to be assessed by
looking at both the child NS set and this new delegation record (although
for secure delegations, the DS set could likely be used in place of the
latter). And then there is the large task of figuring out how to deploy it
Internet-wide.

You are right that the TTL control issue primarily comes up at the TLD/SLD
level, but in my estimation it is certainly not a corner case, nor one that
affects only a small minority.

> There are two categories of unfortunate "surprises" we commonly see due to
> child centric resolvers.  The first surprise is that when redelegating a
> third level domain, it's obvious to moderately experienced operators that
> they must lower the parent NS TTL in order to get a fast rollback, but as
> they don't realise the in-zone NS takes over, they don't lower that TTL.
> Now their fast rollback plan is ineffective on child-centric resolvers.
> It's great to see in this draft that the "delegation revalidation" section
> of the draft seems to solve that sharp edge by choosing the minimum TTL at
> the delegation.  If we conclude we must have child-centric behaviour, this
> at least makes it safer than today.   But still the misunderstanding points
> to the surprising behaviour of child-centric resolvers.   The second
> surprise category is a variety of subtle misconfigurations which we have
> seen at the in-zone NS which operators don't understand (copying the NS
> from the previous zone, altering the NS in an effort to get a full sideways
> delegation, just plain errors, etc).  When we explain these problems, our
> customers say "but the child NS isn't used for delegation, what are you
> talking about?".   "dig +trace" also ignores the in-zone NSes (that could
> be fixed of course, but it reinforces how people think about delegations).
>

I would hope that making resolver behavior consistent would address most of
this. If all resolvers locked on to the authoritative NS set, then most of
the surprises would disappear (hopefully).
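
For concreteness, the TTL handling Gavin mentions (delegation revalidation
choosing the minimum TTL at the delegation) might be sketched as below.
This is a rough Python illustration under my own assumptions, not text from
the draft: `NSSet` and `revalidated_ns` are hypothetical names, the server
names are made up, and a real resolver tracks far more state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NSSet:
    names: frozenset  # name server host names
    ttl: int          # TTL in seconds

def revalidated_ns(parent: NSSet, child: NSSet) -> NSSet:
    """Use the child's authoritative NS names, but keep them cached no
    longer than the smaller of the parent and child TTLs."""
    return NSSet(names=child.names, ttl=min(parent.ttl, child.ttl))

# Operator lowered the parent NS TTL for a rollback but forgot the child.
parent = NSSet(frozenset({"ns1.example.net."}), ttl=300)
child = NSSet(frozenset({"ns1.example.net.", "ns2.example.net."}), ttl=86400)

result = revalidated_ns(parent, child)
print(result.ttl)  # 300: the short parent TTL still bounds the cache time
```

With this rule the operator's short parent TTL still takes effect, even
though the resolver follows the child NS set.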

> PS How truly intractable is the registry argument?  It seems something like
> "When an NS change is made, TTL=3600 for the first N hours, then 2 days
> thereafter." would be a major step forward without drastically increasing
> complexity.
>

Based on history to date, it seems to be rather intractable, but I would
love to be proven wrong. The interfaces that registrars use to update
delegation records in the registries don't even offer any TTL configuration
option that I've seen (even if the registries were willing to support it).
Does the EPP DNS mapping even support setting TTL?
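
For reference, the schedule Gavin proposes is simple to state. A minimal
Python sketch, assuming hypothetical names (`delegation_ttl` and its
parameters) and leaving N as a parameter since the proposal does not fix it:

```python
SHORT_TTL = 3600          # "TTL=3600 for the first N hours"
LONG_TTL = 2 * 86400      # "then 2 days thereafter"

def delegation_ttl(seconds_since_change: int, n_hours: int) -> int:
    """Return the TTL a registry would publish on a delegation NS set,
    given how long ago the NS set was last changed."""
    if seconds_since_change < n_hours * 3600:
        return SHORT_TTL
    return LONG_TTL

# Example with N = 24 (an arbitrary choice for illustration).
print(delegation_ttl(2 * 3600, n_hours=24))    # within the window
print(delegation_ttl(30 * 3600, n_hours=24))   # after the window
```

The hard part is not the logic but getting it exposed through the
registrar and registry interfaces.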

Shumon Huque