[DNSOP] ANAME discussion

2019-03-29 Thread Tony Finch
(Starting a new thread so my mailer doesn't sort the new discussion with
messages from November!)

Thanks to Matthijs Mekking for the good summary this morning. I am happy
for someone else to take over editorial/authorship duties on the draft.

There were several useful points at the mic; thanks Paul Hoffman for
noting them in the minutes (especially because I could not remember who
said what). In no particular order...

Petr Spacek said he thought zone transfers are difficult. In earlier
revisions that is true; in the -02 revision there is nothing special about
zone transfers, they are just the same as in the absence of ANAME. All the
work has been moved to how the records get into the zone. (However the
huge "as if" clause allows implementations to do funky things at zone
transfer time if they want, and you won't be able to tell the difference
from outside.)

Matt Pounsett talked about strategies for polling the DNS at scale
efficiently. Timing was something I worried about a lot when writing the
draft, and I probably over-specified it. I eventually concluded that it
isn't possible in many reasonable implementations to avoid significantly
stretching the TTL of the ANAME target address records by copying them to
the ANAME siblings. Maybe the spec should become more relaxed about this,
and make it a quality-of-implementation issue rather than a requirement.
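
A rough way to see the stretching, as a sketch under assumed numbers: suppose the primary polls the target every `poll_interval` seconds and publishes the sibling records with the target's TTL. The function name and all values below are hypothetical, and the model ignores resolver-side effects.

```python
def worst_case_staleness(target_ttl: int, poll_interval: int,
                         transfer_delay: int = 0) -> int:
    """Upper bound, in seconds, on how long a client can keep using an old
    address after the ANAME target changes (simplified model).

    The change can go unnoticed for a whole polling interval, then takes
    transfer_delay to reach the secondaries, and a resolver that cached the
    old sibling record just before that may keep it for the full TTL.
    """
    return poll_interval + transfer_delay + target_ttl


# Querying the target directly, an old address survives for at most its TTL.
direct = 300
# Via copied sibling records, polling every 300 s with a 10 s transfer delay.
via_siblings = worst_case_staleness(target_ttl=300, poll_interval=300,
                                    transfer_delay=10)
```

In this model the extra staleness is the polling interval plus the transfer delay, which is why it looks more like a quality-of-implementation knob than something a spec can eliminate.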

Regarding geoip, ANAME is no worse than manually configured address
records, and it will eventually become better as resolver support is
deployed. Geoip is the main reason for specifying (optional) ANAME support
in resolvers. (I guess another reason might be to more faithfully respect
the target address TTLs...) Resolver support isn't necessary, though,
provided the ANAME target addresses work everywhere. So it might not be
compatible with all existing geoip implementations, but I don't think
that's a bug in ANAME.

Scalability is definitely a challenge. The worst case is when you have a
very large number of zones sharing the same ANAME target, and that changes
address. If ANAME is implemented using existing standard protocol features
(i.e. UPDATE, NOTIFY, IXFR) as specified, then the change of address is
going to cause a huge volume of modification traffic. But the "as if"
clause exists to allow large scale implementations to do clever things to
mitigate this.

On the list, Brian Dickson and Dan York talked about "just as if there was
a CNAME there". The -02 draft tries to have semantics very close to CNAME,
but there are other interesting possibilities if we relax that
requirement.

What I had in mind for an -03 revision was to remove any requirements
about how the sibling address records are populated. The remaining
requirement is that the sibling addresses must work the same way as the
target addresses, from the point of view of the end user. The sibling
addresses can freely be replaced by the target addresses at any time
(provided whatever is doing the substitution can sign the records when
that is necessary).

This covers both draft -01 and -02 style implementations, and Oracle Dyn's
notion of fallback addresses, and perhaps even vertically integrated
providers that might prefer to direct clients to their own HTTP 302 server
instead of substituting addresses in the DNS. (Though they risk making too
many assumptions about what the point of view of the end user is!)

Dunno if this is a useful direction or not - argue away :-)

Tony.
-- 
f.anthony.n.finch  http://dotat.at/
St Davids Head to Great Orme Head, including St Georges Channel: Southwest 3
or 4, occasionally 5 at first, becoming variable 3 or less, then north 4 or 5
later. Smooth or slight, becoming slight or moderate later in north. Fair.
Good.

___
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop


Re: [DNSOP] ANAME discussion

2019-03-30 Thread Olli Vanhoja
On Fri, Mar 29, 2019 at 9:59 PM Tony Finch wrote:
>
> Thanks to Matthijs Mekking for the good summary this morning. I am happy
> for someone else to take over editorial/authorship duties on the draft.
>

I would be more than happy to help with this draft and to get it
through the process.



Re: [DNSOP] ANAME discussion

2019-03-30 Thread Matthijs Mekking

Tony,

Thanks for listing these points.  I converted them to issues (together 
with some other issues that people mentioned last week and on the list).


  https://github.com/each/draft-aname/issues

I am more than happy to take on the editorial work.

Best regards,

Matthijs






Re: [DNSOP] ANAME discussion

2019-04-02 Thread Jan Včelák
On Fri, Mar 29, 2019 at 9:58 PM Tony Finch wrote:
> There were several useful points at the mic; thanks Paul Hoffman for
> noting them in the minutes (especially because I could not remember who
> said what). In no particular order...

Tim also mentioned that vendors have their own secret sauce for handling
ANAME-like records. Let me share some details about NS1's ALIAS
record implementation. I hope it will be helpful when thinking about the
remaining edge cases and also as input for new implementers.

The most significant difference from the current draft is that sibling
A/AAAA records take precedence over the ALIAS record. I think this behavior
is closer to how a DNS server usually processes CNAME (i.e. first try
to match the QTYPE, then check whether there is a CNAME), though I don't find
this reasoning decisive. One advantage is that you can configure a static
AAAA record and use ALIAS to resolve only the A.
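
The lookup order described above could be sketched roughly like this. It is an illustration only, not NS1's code: the zone layout and the `answer_source` helper are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory zone: (owner name, RR type) -> rdata strings.
Zone = Dict[Tuple[str, str], List[str]]


def answer_source(zone: Zone, qname: str, qtype: str) -> str:
    """Say which data would answer the query: 'static', 'alias' or 'none'."""
    # First try to match the QTYPE directly, as is done before following a CNAME.
    if (qname, qtype) in zone:
        return "static"
    # Only when no sibling record of that type exists is the ALIAS consulted.
    if qtype in ("A", "AAAA") and (qname, "ALIAS") in zone:
        return "alias"
    return "none"


zone: Zone = {
    ("example.com.", "AAAA"): ["2001:db8::1"],          # static sibling
    ("example.com.", "ALIAS"): ["target.example.net."],
}
```

With this zone, an AAAA query is answered from the static record while an A query falls through to the ALIAS target.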

The ALIAS records are resolved by our edge servers when needed and the
result is cached. Nothing is written into the zone. We always resolve both A
and AAAA at the same time and use the lower of the A and AAAA TTLs to
determine how long to keep the result in the cache. The current TTL is also
used in the answers (i.e. you will see the value drop when querying the same
server several times).
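
A minimal sketch of that caching rule, assuming both address types exist at the target. The `AliasCacheEntry` class and its fields are hypothetical names for illustration; time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AliasCacheEntry:
    a: List[str]
    aaaa: List[str]
    expires_at: float  # absolute time, in seconds

    @classmethod
    def from_resolution(cls, a: List[str], a_ttl: int,
                        aaaa: List[str], aaaa_ttl: int,
                        now: float) -> "AliasCacheEntry":
        # Both address types share one lifetime: the lower of the two TTLs.
        return cls(a, aaaa, now + min(a_ttl, aaaa_ttl))

    def remaining_ttl(self, now: float) -> Optional[int]:
        """TTL to put in an answer sent at `now`, or None once expired."""
        left = int(self.expires_at - now)
        return left if left > 0 else None


entry = AliasCacheEntry.from_resolution(
    ["192.0.2.1"], 300, ["2001:db8::1"], 60, now=1000.0)
```

Answers built from `entry` carry `remaining_ttl(now)`, so repeated queries to the same server see the TTL count down until the entry has to be resolved again.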

Also, if neither A nor AAAA exists at the ALIAS target, our authoritative server
responds with SERVFAIL to indicate misconfiguration of the record. The ANAME
from the draft would result in NODATA. We prefer SERVFAIL as we don't want
the resolver to cache NODATA if there is a temporary problem resolving the
ALIAS target.
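
The two behaviours can be contrasted in a short sketch. The `alias_outcome` function and its `servfail_on_empty` switch are hypothetical, and the handling of a target that has only one of the two address types is an assumption.

```python
from typing import List, Tuple


def alias_outcome(target_a: List[str], target_aaaa: List[str], qtype: str,
                  servfail_on_empty: bool = True) -> Tuple[str, List[str]]:
    """Return (rcode, answer records) for an A or AAAA query at an alias."""
    records = target_a if qtype == "A" else target_aaaa
    if records:
        return ("NOERROR", records)
    # The target has no addresses at all: treat it as a misconfiguration.
    if servfail_on_empty and not target_a and not target_aaaa:
        return ("SERVFAIL", [])
    # NODATA: NOERROR with an empty answer, which resolvers may cache.
    return ("NOERROR", [])
```

Calling it with `servfail_on_empty=False` models the NODATA result described for the draft.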

Last but not least, we strip ALIAS from zone transfers.

I guess that's all about the "secret sauce". We would like to move from
ALIAS to ANAME when the draft becomes stable and other implementations start
to emerge.

The critical change for us is likely the A/AAAA vs ANAME priorities. In case
the primary server adds sibling A/AAAA to the ANAME for compatibility with
old resolvers, our implementation would always use the fallback records and
ignore the ANAME if the zone was ingested over a transfer. We also have
existing users that rely on the current behavior and we have to check that
we won't break their setup if we change anything about the processing. I
believe this aspect of sibling A/AAAA was already discovered in the context of
"zombie records" mentioned in https://github.com/each/draft-aname/issues/25.

There will be some challenges but I'm really happy that ANAME is happening.

Jan



Re: [DNSOP] ANAME discussion

2019-04-02 Thread Tony Finch
Jan Včelák submitted a GitHub issue about loop detection which I think
should be discussed by the wg not just the authors.
https://github.com/each/draft-aname/issues/45

The -02 draft requires that CNAME+ANAME chains are chased to their
ultimate target. There are a few reasons for this:

* It is more CNAME-like (though the draft doesn't amend CNAME's behaviour
  to chase ANAMEs.)

* It reduces the amount of TTL stretching that can occur if there is an
  ANAME chain.

If these requirements are relaxed then it makes sense to chase chains less
enthusiastically.
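
As a sketch of what chasing a chain to its ultimate target involves when one party can see the whole chain: the record layout, the `chase` helper, the hop limit, and the choice of the lowest TTL along the chain are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data: name -> (rrtype, ttl, rdata). For ANAME/CNAME the rdata
# is the target name; for A it is a list of addresses.
Records = Dict[str, Tuple[str, int, object]]


def chase(records: Records, name: str,
          max_hops: int = 8) -> Optional[Tuple[List[str], int]]:
    """Follow ANAME/CNAME links from `name`; return (addresses, ttl) or None."""
    seen = set()
    ttl: Optional[int] = None
    while name not in seen and len(seen) < max_hops:
        seen.add(name)
        if name not in records:
            return None
        rrtype, rr_ttl, rdata = records[name]
        # Use the lowest TTL seen along the chain.
        ttl = rr_ttl if ttl is None else min(ttl, rr_ttl)
        if rrtype == "A":
            return list(rdata), ttl
        name = str(rdata)
    return None  # loop detected or chain too long


records: Records = {
    "a.example.": ("ANAME", 3600, "b.example."),
    "b.example.": ("CNAME", 300, "c.example."),
    "c.example.": ("A", 60, ["192.0.2.1"]),
}
```

Here `chase(records, "a.example.")` ends at the A record with the chain's lowest TTL, and a chain that revisits a name returns `None`.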

WRT loop detection, it is much easier if the additional section in the
response from the resolver contains the chain(s). The draft doesn't
specify that at the moment; maybe it should.

Tony.
-- 
f.anthony.n.finch  http://dotat.at/
Bailey: North 5 to 7, occasionally gale 8 at first. Very rough, becoming rough
later in west. Thundery wintry showers. Good, occasionally poor.


Re: [DNSOP] ANAME discussion

2019-04-02 Thread Olli Vanhoja
On Tue, Apr 2, 2019 at 6:03 PM Tony Finch wrote:
>
> WRT loop detection, it is much easier if the additional section in the
> response from the resolver contains the chain(s). The draft doesn't
> specify that at the moment; maybe it should.

Why is it easier? I would think some people may even want to hide the
chain, even though it doesn't exactly hide the provider behind the
final IP.



Re: [DNSOP] ANAME discussion

2019-04-02 Thread Tony Finch
Olli Vanhoja wrote:
> On Tue, Apr 2, 2019 at 6:03 PM Tony Finch wrote:
> >
> > WRT loop detection, it is much easier if the additional section in the
> > response from the resolver contains the chain(s). The draft doesn't
> > specify that at the moment; maybe it should.
>
> Why is it easier?

Maybe it isn't and I wasn't thinking carefully :-)

Tony.
-- 
f.anthony.n.finch  http://dotat.at/
Southeast Iceland: Northerly 7 to severe gale 9, veering northeasterly 5 or 6,
then becoming variable 4 later in west. Very rough or high, becoming moderate
or rough later. Mainly fair. Good, occasionally poor at first.



Re: [DNSOP] ANAME discussion

2019-04-02 Thread Vladimír Čunát
On 4/2/19 7:31 PM, Olli Vanhoja wrote:
> On Tue, Apr 2, 2019 at 6:03 PM Tony Finch wrote:
>> WRT loop detection, it is much easier if the additional section in the
>> response from the resolver contains the chain(s). The draft doesn't
>> specify that at the moment; maybe it should.
> Why is it easier? I would think some people may even want to hide the
> chain, even though it doesn't exactly hide the provider behind the
> final IP.

If you return an empty SERVFAIL, your client (resolver) can't know it
shouldn't retry and can't know how long not to retry.  I posted more
details on the GitHub ticket.

--Vladimir (Knot Resolver)



Re: [DNSOP] ANAME discussion

2019-04-03 Thread Tony Finch
One thing I have been pondering is multi-target aliases. (But I haven't
posted about it here because I don't think the suggestion will get very
far!) Multi-aliases would be useful in some situations when I would like
to be able to model systems at a higher level, for things like
mx.cam.ac.uk which is a round-robin alias for addresses hosted on several
servers.

See also:
https://blog.mythic-beasts.com/2019/03/22/round-robin-dns-another-use-for-anames/
https://github.com/each/draft-aname/issues/11

As Evan Hunt says in that issue, this is a huge can of worms. There are
fun problems like a billion laughs attack on resolvers that try to chase
down ANAME/CNAME chains to the ultimate target addresses...
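
To put a number on the amplification, here is a small sketch that assumes every multi-target alias names `fanout` further aliases, nested `depth` levels deep. Both parameters and the helper name are hypothetical.

```python
def queries_to_resolve(fanout: int, depth: int) -> int:
    """Lookups needed to chase every branch of a nested multi-target alias."""
    # Level 1 has `fanout` targets, level 2 has fanout**2, and so on.
    return sum(fanout ** level for level in range(1, depth + 1))


# Ten targets per alias, nested nine levels deep.
total = queries_to_resolve(10, 9)
```

With these numbers `total` is over a billion lookups from a handful of small records, which is the shape of the classic billion laughs attack.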

Tony.
-- 
f.anthony.n.finch  http://dotat.at/
Great Orme Head to the Mull of Galloway: Northwest 5 to 7, veering east or
northeast 4 or 5, increasing 6 at times. Slight or moderate, occasionally
rough far at first in far west. Rain or showers, occasionally thundery at
first. Good, occasionally poor at first.



Re: [DNSOP] ANAME discussion

2019-04-09 Thread Jan Včelák
On Tue, Apr 2, 2019 at 5:54 PM Tony Finch wrote:
> WRT loop detection, it is much easier if the additional section in the
> response from the resolver contains the chain(s). The draft doesn't
> specify that at the moment; maybe it should.

I meant a situation where an authoritative server is doing the sibling
address record substitution using an external resolver.

Imagine the following ANAME loop:
foo. ANAME bar.
bar. ANAME foo.

For simplicity, assume the zones live on different
authoritative servers and that the ANAME processing is triggered by
the first query.

The resolution steps will look something like this:
1. Authoritative receives a query for foo.
2. Authoritative finds the ANAME and calls out to the resolver asking for bar.
3. Resolver sends a query for bar to the authoritative.
4. Authoritative finds the ANAME and calls out to the resolver asking for foo.
5. goto 1

The authoritative server acting as a stub resolver doesn't have full
context of the resolution chain and therefore cannot break the loop.
We would have to pass around additional context in the queries and I'm
not sure if DNS firewalls would be happy to see messages with QR = 0
and ARCOUNT > 0.
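
The steps above can be modelled with a toy simulation. It is a sketch only: the `count_queries` helper and the `budget` parameter, which stands in for whatever timeout or rate limit eventually stops the traffic, are hypothetical.

```python
from typing import Dict


def count_queries(zones: Dict[str, str], qname: str, budget: int) -> int:
    """Count authoritative queries caused by one client query.

    `zones` maps an owner name to its ANAME target. Each server sees only the
    query in front of it, so nothing here remembers earlier names.
    """
    sent = 0
    name = qname
    while sent < budget:
        sent += 1                  # an authoritative server receives a query
        target = zones.get(name)
        if target is None:
            break                  # no ANAME here: real address data found
        name = target              # call out to the resolver for the target
    return sent


loop = {"foo.": "bar.", "bar.": "foo."}
```

With `loop`, the count always equals the budget, however large, while a non-looping zone such as `{"foo.": "bar."}` stops after two queries.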

Jan



Re: [DNSOP] ANAME discussion

2019-04-09 Thread Richard Gibson
This loop is one reason of several to eliminate inline resolution for 
ANAME if possible and minimize it otherwise, but is not quite as bad as 
it seems because all involved servers can—and should—avoid issuing 
queries that are redundant with an already-active request. But even if 
they don't, the early queries eventually time out and rate limiting 
eventually detects and caps the runaway load.


In other words, this misconfiguration does not create any new 
vulnerabilities, and existing mechanisms are already sufficient to 
handle it (although the document should explicitly mention them to avoid 
subjecting new implementers to unnecessarily painful lessons).





Re: [DNSOP] ANAME discussion

2019-04-09 Thread Vladimír Čunát
On 4/9/19 3:38 PM, Richard Gibson wrote:
> This loop is one reason of several to eliminate inline resolution for
> ANAME if possible and minimize it otherwise, but is not quite as bad
> as it seems because all involved servers can—and should—avoid issuing
> queries that are redundant with an already-active request. But even if
> they don't, the early queries eventually time out and rate limiting
> eventually detects and caps the runaway load.
>
> In other words, this misconfiguration does not create any new
> vulnerabilities, and existing mechanisms are already sufficient to
> handle it (although the document should explicitly mention them to
> avoid subjecting new implementers to unnecessarily painful lessons).

I can't even see a simple way of detecting this.  At least in the
implementation suggested by Jan where you have an authoritative that
calls out to a resolver (which calls out to authoritatives...) - it
would need some magic that somehow links one query of the cycle to the
other but regular DNS queries do not currently carry such information
AFAIK.  Am I missing some obvious approach?

--Vladimir (Knot Resolver)



Re: [DNSOP] ANAME discussion

2019-04-09 Thread Richard Gibson
If an implementation has a resolver, then that component is the logical 
place for deduplication (e.g., the second inbound query for a given 
ANAME target does not result in a second outbound query, but rather 
waits on completion of the first).
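
One way to sketch that deduplication, here with Python's asyncio. The `DedupResolver` class and the `upstream` callable are hypothetical, and a real resolver would also need timeouts and cancellation handling.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class DedupResolver:
    """Shares one outbound lookup among all concurrent requests for a name."""

    def __init__(self, upstream: Callable[[str], Awaitable[List[str]]]):
        self._upstream = upstream
        self._in_flight: Dict[str, asyncio.Future] = {}

    async def resolve(self, name: str) -> List[str]:
        pending = self._in_flight.get(name)
        if pending is None:
            # First request for this name: start the outbound query.
            pending = asyncio.ensure_future(self._upstream(name))
            self._in_flight[name] = pending
            # Forget the entry once the query completes or fails.
            pending.add_done_callback(
                lambda _f: self._in_flight.pop(name, None))
        # Later requests simply wait on the query that is already running.
        return await pending
```

In a loop, the second inbound query for the same target would then wait on the first instead of spawning another outbound query, so the requests stall until a timeout fires and do not multiply.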







Re: [DNSOP] ANAME discussion

2019-04-09 Thread Tony Finch
Vladimír Čunát wrote:
>
> I can't even see a simple way of detecting this.  At least in the
> implementation suggested by Jan where you have an authoritative that
> calls out to a resolver (which calls out to authoritatives...)

You could prevent the loop from leading to a circular dependency, rather
than detecting the loop, e.g. if the auth always answers from zone or
cache which are updated asynchronously.

Maybe the auth's resolver could chase the chain by making ANAME queries;
when the auth replies it can reply from zone data and skip filling in the
additional section if it doesn't have fresh address records. The auth can
be more eager to make recursive queries when it gets A or AAAA queries.
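
A minimal sketch of the "answer from zone or cache, update asynchronously" idea. The `AsyncAnameAuth` class, its fields, and the background worker that would drain `refresh_queue` are all hypothetical.

```python
from typing import Dict, List


class AsyncAnameAuth:
    """Authoritative server that never waits on a resolver while answering."""

    def __init__(self) -> None:
        self.cache: Dict[str, List[str]] = {}  # ANAME target -> addresses
        self.refresh_queue: List[str] = []     # targets to resolve later

    def sibling_addresses(self, target: str) -> List[str]:
        """Addresses to add to a reply; empty when nothing fresh is cached."""
        addrs = self.cache.get(target)
        if addrs is None:
            # Do not resolve inline: note the miss and reply without addresses.
            if target not in self.refresh_queue:
                self.refresh_queue.append(target)
            return []
        return addrs


auth = AsyncAnameAuth()
```

Because a reply never depends on an outbound query finishing, a foo/bar loop degrades to periodic background lookups that find no addresses; the query path itself cannot recurse.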

Tony.
-- 
f.anthony.n.finch  http://dotat.at/
promote human rights and open government


Re: [DNSOP] ANAME discussion

2019-04-10 Thread Vladimír Čunát
On 4/9/19 6:44 PM, Richard Gibson wrote:
> If an implementation has a resolver, then that component is the
> logical place for deduplication (e.g., the second inbound query for a
> given ANAME target does not result in a second outbound query, but
> rather waits on completion of the first). 

Oh, right, that will simply solve the worst parts (not caching though). 
With many resolver instances I imagine it would be a bit more difficult
to do efficiently, but it might be too soon to worry about such details.

