[DNSOP] ANAME discussion
(Starting a new thread so my mailer doesn't sort the new discussion with messages from November!)

Thanks to Matthijs Mekking for the good summary this morning. I am happy for someone else to take over editorial/authorship duties on the draft.

There were several useful points at the mic; thanks Paul Hoffman for noting them in the minutes (especially because I could not remember who said what). In no particular order...

Petr Spacek said he thought zone transfers are difficult. In earlier revisions that is true; in the -02 revision there is nothing special about zone transfers, they are just the same as in the absence of ANAME. All the work has been moved to how the records get into the zone. (However the huge "as if" clause allows implementations to do funky things at zone transfer time if they want, and you won't be able to tell the difference from outside.)

Matt Pounsett talked about strategies for polling the DNS at scale efficiently. Timing was something I worried about a lot when writing the draft, and I probably over-specified it. I eventually concluded that it isn't possible in many reasonable implementations to avoid significantly stretching the TTL of the ANAME target address records by copying them to the ANAME siblings. Maybe the spec should become more relaxed about this, and make it a quality-of-implementation issue rather than a requirement.

Regarding geoip, ANAME is no worse than manually configured address records, and it will eventually become better as resolver support is deployed. Geoip is the main reason for specifying (optional) ANAME support in resolvers. (I guess another reason might be to more faithfully respect the target address TTLs...) Resolver support isn't necessary, though, provided the ANAME target addresses work everywhere. So it might not be compatible with all existing geoip implementations, but I don't think that's a bug in ANAME.

Scalability is definitely a challenge. The worst case is when you have a very large number of zones sharing the same ANAME target, and that changes address. If ANAME is implemented using existing standard protocol features (i.e. UPDATE, NOTIFY, IXFR) as specified, then the change of address is going to cause a huge volume of modification traffic. But the "as if" clause exists to allow large scale implementations to do clever things to mitigate this.

On the list, Brian Dickson and Dan York talked about "just as if there was a CNAME there". The -02 draft tries to have semantics very close to CNAME, but there are other interesting possibilities if we relax that requirement.

What I had in mind for an -03 revision was to remove any requirements about how the sibling address records are populated. The remaining requirement is that the sibling addresses must work the same way as the target addresses, from the point of view of the end user. The sibling addresses can freely be replaced by the target addresses at any time (provided whatever is doing the substitution can sign the records when that is necessary). This covers both draft -01 and -02 style implementations, and Oracle Dyn's notion of fallback addresses, and perhaps even vertically integrated providers that might prefer to direct clients to their own HTTP 302 server instead of substituting addresses in the DNS. (Though they risk making too many assumptions about what the point of view of the end user is!)

Dunno if this is a useful direction or not - argue away :-)

Tony.
--
f.anthony.n.finch http://dotat.at/
St Davids Head to Great Orme Head, including St Georges Channel: Southwest 3 or 4, occasionally 5 at first, becoming variable 3 or less, then north 4 or 5 later. Smooth or slight, becoming slight or moderate later in north. Fair. Good.
___
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
Re: [DNSOP] ANAME discussion
On Fri, Mar 29, 2019 at 9:59 PM Tony Finch wrote:
>
> Thanks to Matthijs Mekking for the good summary this morning. I am happy
> for someone else to take over editorial/authorship duties on the draft.
>

I would be more than happy to help with this draft and to get it through the process.
Re: [DNSOP] ANAME discussion
Tony,

Thanks for listing these points. I converted them to issues (together with some other issues that people mentioned last week and on the list).

https://github.com/each/draft-aname/issues

I am more than happy to take on the editorial work.

Best regards,
Matthijs

On 3/29/19 9:58 PM, Tony Finch wrote:
> [...]
Re: [DNSOP] ANAME discussion
On Fri, Mar 29, 2019 at 9:58 PM Tony Finch wrote:
> There were several useful points at the mic; thanks Paul Hoffman for
> noting them in the minutes (especially because I could not remember who
> said what). In no particular order...

Tim also mentioned that the vendors have their own secret sauce for handling the ANAME-like records. Let me share some details about NS1's ALIAS record implementation. I hope it will be helpful when thinking about the remaining edge cases and also as an input for new implementers.

The most significant difference from the current draft is that sibling A/AAAA records have precedence over the ALIAS record. I think this behavior is closer to how CNAME is usually processed by a DNS server (i.e. first try to match the QTYPE, then check if there is a CNAME); however, I don't find this reasoning decisive. One advantage is that you can configure a static AAAA record and use ALIAS to resolve only the A.

The ALIAS records are resolved by our edge servers when needed and the result is cached. Nothing is written into the zone. We always resolve both A and AAAA at the same time and use the minimal TTL of the A and AAAA to determine how long to keep the result in the cache. The current TTL is also used in the answers (i.e. you will see the value drop when querying the same server several times).

Also, if neither A nor AAAA exists at the ALIAS target, our authoritative server responds with SERVFAIL to indicate misconfiguration of the record. The ANAME from the draft would result in NODATA. We prefer SERVFAIL as we don't want the resolver to cache NODATA if there is an interim problem resolving the ALIAS target.

Last but not least, we strip ALIAS from zone transfers.

I guess that's all about the "secret sauce".

We would like to move from ALIAS to ANAME when the draft becomes stable and other implementations start to emerge. The critical change for us is likely the A/AAAA vs ANAME priorities. In case the primary server adds sibling A/AAAA to the ANAME for compatibility with old resolvers, our implementation would always use the fallback records and ignore the ANAME if the zone was ingested over a transfer. We also have existing users that rely on the current behavior and we have to check that we won't break their setup if we change anything about the processing.

I believe this aspect of sibling A/AAAA was already discovered in the context of "zombie records" mentioned in https://github.com/each/draft-aname/issues/25.

There will be some challenges but I'm really happy that ANAME is happening.

Jan
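The caching behaviour described in the message above (resolve A and AAAA together, keep both for the smaller of the two TTLs, answer with the remaining TTL, SERVFAIL when neither type exists) can be sketched roughly as follows. This is only an illustration, not NS1's code: `AliasCache`, the `lookup` callback and the injected `clock` are hypothetical names, and taking the TTL of whichever RRset exists when only one does is an assumption the message does not spell out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical resolver callback: (name, qtype) -> (addresses, ttl).
# An empty address list means the RRset does not exist.
Lookup = Callable[[str, str], Tuple[List[str], int]]


@dataclass
class CacheEntry:
    a: List[str]
    aaaa: List[str]
    expires: float  # absolute time at which both RRsets leave the cache


class AliasCache:
    def __init__(self, lookup: Lookup, clock: Callable[[], float] = time.monotonic):
        self.lookup = lookup
        self.clock = clock
        self.entries: Dict[str, CacheEntry] = {}

    def answer(self, target: str, qtype: str) -> Tuple[str, List[str], int]:
        """Return (rcode, addresses, ttl) for an ALIAS pointing at target."""
        now = self.clock()
        entry = self.entries.get(target)
        if entry is None or entry.expires <= now:
            # Always resolve both address types at the same time.
            a, a_ttl = self.lookup(target, "A")
            aaaa, aaaa_ttl = self.lookup(target, "AAAA")
            if not a and not aaaa:
                # Neither type exists: signal SERVFAIL and cache nothing,
                # so a transient failure is not remembered as NODATA.
                return ("SERVFAIL", [], 0)
            # Assumption: only RRsets that exist contribute to the minimum.
            ttls = [ttl for rrs, ttl in ((a, a_ttl), (aaaa, aaaa_ttl)) if rrs]
            entry = CacheEntry(a, aaaa, now + min(ttls))
            self.entries[target] = entry
        addrs = entry.a if qtype == "A" else entry.aaaa
        # The TTL in the answer counts down while the entry stays cached.
        return ("NOERROR", addrs, int(entry.expires - now))
```

Repeated queries against the same cache return a decreasing TTL until the entry expires, matching the "you will see the value drop" remark.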
Re: [DNSOP] ANAME discussion
Jan Včelák submitted a GitHub issue about loop detection which I think should be discussed by the wg not just the authors.

https://github.com/each/draft-aname/issues/45

The -02 draft requires that CNAME+ANAME chains are chased to their ultimate target. There are a few reasons for this:

* It is more CNAME-like (though the draft doesn't amend CNAME's behaviour to chase ANAMEs.)

* It reduces the amount of TTL stretching that can occur if there is an ANAME chain.

If these requirements are relaxed then it makes sense to chase chains less enthusiastically.

WRT loop detection, it is much easier if the additional section in the response from the resolver contains the chain(s). The draft doesn't specify that at the moment; maybe it should.

Tony.
--
f.anthony.n.finch http://dotat.at/
Bailey: North 5 to 7, occasionally gale 8 at first. Very rough, becoming rough later in west. Thundery wintry showers. Good, occasionally poor.
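To illustrate why seeing the whole chain makes loop detection simple, here is a minimal sketch of chasing a CNAME+ANAME chain when every link is visible to one party. It is not taken from the draft: the `records` mapping, `chase` and `AliasLoop` are hypothetical, and real resolution would involve queries, TTLs and DNSSEC that are left out.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical in-memory view of the data: owner -> (type, rdata), where
# rdata is a target name for CNAME/ANAME and an address list otherwise.
Record = Tuple[str, Union[str, List[str]]]


class AliasLoop(Exception):
    """Raised when a CNAME/ANAME chain revisits a name."""


def chase(records: Dict[str, Record], name: str) -> List[str]:
    """Follow CNAME and ANAME records from name to the final addresses."""
    seen: List[str] = []
    while True:
        if name in seen:
            # The full chain is known here, so the loop is trivially visible.
            raise AliasLoop(" -> ".join(seen + [name]))
        seen.append(name)
        rtype, rdata = records[name]
        if rtype in ("CNAME", "ANAME"):
            name = rdata  # hop to the alias target
        else:
            return list(rdata)  # address records: the ultimate target
```

The check depends on the `seen` list, which is exactly the context that is lost when each hop is answered by a different server.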
Re: [DNSOP] ANAME discussion
On Tue, Apr 2, 2019 at 6:03 PM Tony Finch wrote:
>
> WRT loop detection, it is much easier if the additional section in the
> response from the resolver contains the chain(s). The draft doesn't
> specify that at the moment; maybe it should.

Why is it easier? I would think some people may even want to hide the chain, even though it doesn't exactly hide the provider behind the final IP.
Re: [DNSOP] ANAME discussion
Olli Vanhoja wrote:
> On Tue, Apr 2, 2019 at 6:03 PM Tony Finch wrote:
> >
> > WRT loop detection, it is much easier if the additional section in the
> > response from the resolver contains the chain(s). The draft doesn't
> > specify that at the moment; maybe it should.
>
> Why is it easier?

Maybe it isn't and I wasn't thinking carefully :-)

Tony.
--
f.anthony.n.finch http://dotat.at/
Southeast Iceland: Northerly 7 to severe gale 9, veering northeasterly 5 or 6, then becoming variable 4 later in west. Very rough or high, becoming moderate or rough later. Mainly fair. Good, occasionally poor at first.
Re: [DNSOP] ANAME discussion
On 4/2/19 7:31 PM, Olli Vanhoja wrote:
> On Tue, Apr 2, 2019 at 6:03 PM Tony Finch wrote:
>> WRT loop detection, it is much easier if the additional section in the
>> response from the resolver contains the chain(s). The draft doesn't
>> specify that at the moment; maybe it should.
> Why is it easier? I would think some people may even want to hide the
> chain, even though it doesn't exactly hide the provider behind the
> final IP.

If you return an empty SERVFAIL, your client (resolver) can't know it shouldn't retry and can't know how long not to retry. I posted more details on the GitHub ticket.

--Vladimir (Knot Resolver)
Re: [DNSOP] ANAME discussion
One thing I have been pondering is multi-target aliases. (But I haven't posted about it here because I don't think the suggestion will get very far!)

Multi-aliases would be useful in some situations when I would like to be able to model systems at a higher level, for things like mx.cam.ac.uk which is a round-robin alias for addresses hosted on several servers.

See also:

https://blog.mythic-beasts.com/2019/03/22/round-robin-dns-another-use-for-anames/
https://github.com/each/draft-aname/issues/11

As Evan Hunt says in that issue, this is a huge can of worms. There are fun problems like a billion laughs attack on resolvers that try to chase down ANAME/CNAME chains to the ultimate target addresses...

Tony.
--
f.anthony.n.finch http://dotat.at/
Great Orme Head to the Mull of Galloway: Northwest 5 to 7, veering east or northeast 4 or 5, increasing 6 at times. Slight or moderate, occasionally rough far at first in far west. Rain or showers, occasionally thundery at first. Good, occasionally poor at first.
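The "billion laughs" concern can be made concrete with a little arithmetic. The sketch below assumes a naive chaser that issues one query per alias target at every level and never deduplicates names it has already looked up; `lookups_to_expand`, the fan-out and the depth are illustrative values, not anything from the draft or the linked issue.

```python
def lookups_to_expand(fanout: int, depth: int) -> int:
    """Count queries a naive chaser needs to reach every final address.

    Each alias is assumed to point at `fanout` targets, nested `depth`
    levels deep, with no caching or deduplication between branches.
    """
    # Level i of the tree holds fanout**i names, for i = 1..depth.
    return sum(fanout ** level for level in range(1, depth + 1))


# Ten targets per alias, nine levels deep: over a billion queries.
print(lookups_to_expand(10, 9))  # prints 1111111110
```

A chaser that remembers names it has already resolved would do far less work when the levels reuse the same names, which is one reason such limits would need to be stated if multi-target aliases were ever specified.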
Re: [DNSOP] ANAME discussion
On Tue, Apr 2, 2019 at 5:54 PM Tony Finch wrote:
> WRT loop detection, it is much easier if the additional section in the
> response from the resolver contains the chain(s). The draft doesn't
> specify that at the moment; maybe it should.

I meant a situation where an authoritative server is doing the sibling address record substitution using an external resolver. Imagine the following ANAME loop:

    foo. ANAME bar.
    bar. ANAME foo.

For simplification, expect the zones to live on different authoritative servers and also that the ANAME processing triggers with the first query. The resolution steps will look something like this:

1. Authoritative receives a query for foo.
2. Authoritative finds the ANAME and calls out to the resolver asking for bar.
3. Resolver sends a query for bar to the authoritative.
4. Authoritative finds the ANAME and calls out to the resolver asking for foo.
5. goto 1

The authoritative server acting as a stub resolver doesn't have full context of the resolution chain and therefore cannot break the loop. We would have to pass around additional context in the queries and I'm not sure if DNS firewalls would be happy to see messages with QR = 0 and ARCOUNT > 0.

Jan
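The five steps above can be simulated in a few lines. This is a toy model under stated assumptions: both zones live in one dictionary, the function names are hypothetical, and the shared `budget` counter is not something DNS carries in queries. It stands in for the external limits (timeouts, rate limiting) that eventually stop the loop, because neither function can see that the name it is asked about is already being resolved.

```python
from typing import Dict, List, Tuple


class LoopBudgetExceeded(Exception):
    """The simulated external limit (timeout or rate limit) was reached."""


Zones = Dict[str, Tuple[str, str]]  # owner -> (type, rdata)


def authoritative_query(zones: Zones, name: str, budget: List[int]) -> str:
    """Steps 1, 2 and 4: look up name; on ANAME, call out to the resolver."""
    rtype, rdata = zones[name]
    if rtype == "ANAME":
        return resolver_query(zones, rdata, budget)
    return rdata  # a plain address record ends the resolution


def resolver_query(zones: Zones, name: str, budget: List[int]) -> str:
    """Step 3: forward the query to the authoritative server for name."""
    if budget[0] == 0:
        raise LoopBudgetExceeded(name)
    budget[0] -= 1
    # No per-query context is passed on, so a loop cannot be recognised here.
    return authoritative_query(zones, name, budget)
```

With `foo.` and `bar.` pointing at each other, every call consumes budget until the limit fires; replacing one ANAME with an address record lets the same code terminate normally.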
Re: [DNSOP] ANAME discussion
This loop is one reason of several to eliminate inline resolution for ANAME if possible and minimize it otherwise, but is not quite as bad as it seems because all involved servers can, and should, avoid issuing queries that are redundant with an already-active request. But even if they don't, the early queries eventually time out and rate limiting eventually detects and caps the runaway load.

In other words, this misconfiguration does not create any new vulnerabilities, and existing mechanisms are already sufficient to handle it (although the document should explicitly mention them to avoid subjecting new implementers to unnecessarily painful lessons).

On 4/9/19 08:09, Jan Včelák wrote:
> [...]
Re: [DNSOP] ANAME discussion
On 4/9/19 3:38 PM, Richard Gibson wrote:
> This loop is one reason of several to eliminate inline resolution for
> ANAME if possible and minimize it otherwise, but is not quite as bad
> as it seems because all involved servers can, and should, avoid issuing
> queries that are redundant with an already-active request. But even if
> they don't, the early queries eventually time out and rate limiting
> eventually detects and caps the runaway load.
>
> In other words, this misconfiguration does not create any new
> vulnerabilities, and existing mechanisms are already sufficient to
> handle it (although the document should explicitly mention them to
> avoid subjecting new implementers to unnecessarily painful lessons).

I can't even see a simple way of detecting this. At least in the implementation suggested by Jan where you have an authoritative that calls out to a resolver (which calls out to authoritatives...) - it would need some magic that somehow links one query of the cycle to the other but regular DNS queries do not currently carry such information AFAIK. Am I missing some obvious approach?

--Vladimir (Knot Resolver)
Re: [DNSOP] ANAME discussion
If an implementation has a resolver, then that component is the logical place for deduplication (e.g., the second inbound query for a given ANAME target does not result in a second outbound query, but rather waits on completion of the first).

On 4/9/19 11:15, Vladimír Čunát wrote:
> [...]
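The deduplication described above (a second inbound query for the same ANAME target waits on the first instead of triggering another outbound query) might look roughly like this with Python's asyncio. `DedupResolver` and the `fetch` coroutine are hypothetical names for illustration; a real resolver would also key on query type and class and apply a timeout, since in a loop the shared query never completes on its own.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class DedupResolver:
    def __init__(self, fetch: Callable[[str], Awaitable[List[str]]]):
        self._fetch = fetch  # hypothetical coroutine doing the outbound query
        self._inflight: Dict[str, "asyncio.Future[List[str]]"] = {}

    async def resolve(self, target: str) -> List[str]:
        pending = self._inflight.get(target)
        if pending is not None:
            # Same target already being resolved: wait for that result.
            return await pending
        task = asyncio.ensure_future(self._fetch(target))
        self._inflight[target] = task
        # Forget the entry once the outbound query finishes either way.
        task.add_done_callback(lambda _: self._inflight.pop(target, None))
        return await task
```

This caps the outbound load at one query per target at a time. It does not cache results, which matches the caveat raised in the follow-up that deduplication solves the worst parts but not caching.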
Re: [DNSOP] ANAME discussion
Vladimír Čunát wrote:
>
> I can't even see a simple way of detecting this. At least in the
> implementation suggested by Jan where you have an authoritative that
> calls out to a resolver (which calls out to authoritatives...)

You could prevent the loop from leading to a circular dependency, rather than detecting the loop, e.g. if the auth always answers from zone or cache which are updated asynchronously.

Maybe the auth's resolver could chase the chain by making ANAME queries; when the auth replies it can reply from zone data and skip filling in the additional section if it doesn't have fresh address records. The auth can be more eager to make recursive queries when it gets A or AAAA queries.

Tony.
--
f.anthony.n.finch http://dotat.at/
promote human rights and open government
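One way to read the suggestion above is that the query path never resolves anything inline: it answers from data already on hand and only queues a refresh for later. The sketch below illustrates that reading under assumptions. `BackgroundAnameServer`, its fields and the `resolve` callback are hypothetical, and TTLs, signing and the additional section are ignored.

```python
from typing import Callable, Dict, List


class BackgroundAnameServer:
    def __init__(self, anames: Dict[str, str]):
        self.anames = anames                      # owner name -> ANAME target
        self.siblings: Dict[str, List[str]] = {}  # owner name -> cached addresses
        self.pending: List[str] = []              # owners awaiting a refresh

    def answer(self, name: str) -> List[str]:
        """Answer an address query from local data only; never resolve inline."""
        if name in self.anames and name not in self.pending:
            # Address queries make the server more eager to refresh later.
            self.pending.append(name)
        return self.siblings.get(name, [])

    def refresh(self, resolve: Callable[[str], List[str]]) -> None:
        """Background step: re-resolve the targets queued by earlier queries."""
        queued, self.pending = self.pending, []
        for owner in queued:
            self.siblings[owner] = resolve(self.anames[owner])
```

With `foo.` and `bar.` pointing at each other and `resolve` wired back to `answer`, each query and each refresh pass still finishes: the loop shows up as sibling addresses that stay empty, not as queries that wait on each other.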
Re: [DNSOP] ANAME discussion
On 4/9/19 6:44 PM, Richard Gibson wrote:
> If an implementation has a resolver, then that component is the
> logical place for deduplication (e.g., the second inbound query for a
> given ANAME target does not result in a second outbound query, but
> rather waits on completion of the first).

Oh, right, that will simply solve the worst parts (not caching though). With many resolver instances I imagine it would be a bit more difficult to do efficiently, but it might be too soon to worry about such details.