Re: [DNSOP] Status of draft-ietf-dnsop-dns-error-reporting

2021-11-12 Thread Manu Bretelle
Hi Roy,

It seems these two paragraphs conflict with each other:

On the one hand the aggressive use of DNSSEC-validated cache is suggested
for the reporting agent:

```

This caching is essential.  It ensures that the number of reports
   sent by a reporting resolver for the same problem is dampened, i.e.
   once per TTL, however, certain optimizations such as [RFC8020] and
   [RFC8198] may reduce the number of error reporting queries as well.
```

But on the other hand, it is not recommended to sign the reporting agent domain.

```
A solution is to avoid DNSSEC for the reporting agent domain.
   Signing the agent domain will incur an additional burden on the
   reporting resolver, as it has to validate the response.  However,
   this response has no utility to the reporting resolver.
```
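
To make the tension concrete, here is a minimal Python sketch of the
once-per-TTL dampening taken on its own. It assumes the resolver simply
remembers, per report QNAME, until when the agent's last answer is cached.
The class name `ReportDampener`, the injected clock, and passing the TTL in
directly are assumptions made for illustration; the draft does not specify
any of them.

```python
import time


class ReportDampener:
    """Suppress repeat reports for the same problem until a TTL expires.

    Hypothetical helper: a real resolver would get this effect from its
    ordinary cache of the reporting agent's answers.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # report QNAME (lower-cased) -> time at which a new report is allowed
        self._expires = {}

    def should_report(self, report_qname: str, ttl: float) -> bool:
        """Return True if a report query should be sent now."""
        now = self._clock()
        key = report_qname.lower()  # DNS names compare case-insensitively
        expiry = self._expires.get(key)
        if expiry is not None and now < expiry:
            return False  # still cached: the report is dampened
        self._expires[key] = now + ttl
        return True
```

This part works with a plain, unvalidated cache. The further reduction the
draft attributes to [RFC8198], which relies on a DNSSEC-validated cache, is
not modelled here.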

Manu


On Tue, Nov 9, 2021 at 3:07 PM Roy Arends  wrote:

> Dear WG,
>
> After the October 26, IETF DNSOP interim WG on DNS Error Reporting, the
> document editors have made the following changes to reflect the discussion:
>
> Change 1) Due to qname minimisation, the reporting agent may not know that
> the reported string has been shortened. There were a few options suggested,
> such as adding a label counter. However, the most straightforward option
> seemed to be to start the reporting query with an _er label as well.
>
> Change 2) There was an observation by developers that some authoritative
> servers do not parse (unknown) EDNS0 options correctly, leading to an
> additional roundtrip by the resolver. It was suggested that authoritative
> servers could return the new EDNS0 option “unsolicited”. This is already
> the case for Extended DNS errors. We have adopted this suggestion. It was
> also pointed out that this kind of unsolicited behaviour can be surveyed.
> We believe that one such effort is underway.
>
> Change 3) There was a lot of descriptive text about what implementations should
> and shouldn’t do, and what configurations should and shouldn’t do. This was
> found to be overly descriptive and pedantic, and has now been removed.
>
> There was a request to put the markdown version of the document in GitHub.
> This has now been placed here:
> https://github.com/RoyArends/draft-ietf-dnsop-dns-error-reporting
>
> New version:
> https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-dns-error-reporting-01.txt
> Diffs:
> https://www.ietf.org/rfcdiff?url2=draft-ietf-dnsop-dns-error-reporting-01
>
> Warm regards,
>
> Roy Arends
>
> ___
> DNSOP mailing list
> DNSOP@ietf.org
> https://www.ietf.org/mailman/listinfo/dnsop
>
___
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop


Re: [DNSOP] Status of draft-ietf-dnsop-dns-error-reporting

2021-11-12 Thread Manu Bretelle
Hi,

On Fri, Nov 12, 2021 at 5:24 AM Petr Špaček  wrote:

> On 12. 11. 21 7:42, Manu Bretelle wrote:
> > Hi Roy,
> >
> > I like the idea of an out-of-band error reporting and therefore I like
> > the proposition of this draft.
> >
> > One of the things I have a hard time visualizing though is how this
> > could be used for more than reporting DNSSEC specific errors. With the
> > option not being signed in the first place, it does not seem that DNSSEC
> > is a requirement to be able to leverage this functionality, hence it
> > would be great to think how we can make this work for more than
> > DNSSEC-only errors.
>
> E.g. it can conceivably report errors like "resolver had to fall back to
> the Nth server because the first one we tried timed out". Is that a
> sufficient example?
>

I suppose it could. Another one that may already fit an existing EDE code
is EDE Code 3, "Stale Answer":
https://www.rfc-editor.org/rfc/rfc8914.html#name-extended-dns-error-code-3-s

For some others I have a harder time seeing the value, for example EDE
Code 20, "Not Authoritative":
https://www.rfc-editor.org/rfc/rfc8914.html#name-extended-dns-error-code-20-
On the one hand, this is a log you already have as an auth operator. On the
other hand, through the reporting endpoint, and ignoring possible abuses of
said endpoint, you would get a peek at the resolver's view, not just any
unsolicited request that was sent to your auth server, making it easier to
track broken delegations.


> > As it is, the requirement for the EDNS0 option to be in the response,
> > while it does offer some properties such as controlling the sampling
> > rate, essentially prevents any report about answers that are not
> > properly formatted in the first place, or that are never received: when
> > a resolver is not able to reach any authoritative server for a given
> > name, when a resolver starts falling back on stale data, and possibly,
> > in the future, when it fails to reach a server over an advertised
> > encrypted channel. There is likely value for an authoritative server
> > operator in being able to get reports for those issues too.
>
> While I agree with the sentiment that reporting other issues would
> also be useful, I think that _for now_ we should keep the scope limited to
> situations which do not require any extra state in resolvers.
>
> That is, reporting "no server is reachable" requires prior information
> stored or reachable somewhere else, which is IMHO an order of magnitude
> more complex task. Let's get experience with simple error reporting
> first and only then move forward to more complex tasks...
>

I am more than happy to take an iterative approach to this. My concern was
that this solution would be the end goal, essentially closing off the
possibility of reporting other types of errors such as the ones mentioned.


>
> > The title of the draft, "DNS Error Reporting", would lead one to believe
> > that it is a somewhat generic mechanism, but I don't think it is as it
> > stands.
>
> I disagree here. It is a generic mechanism, see the first response
> paragraph in this e-mail.


That sentence was meant to be read together with the rest of the paragraph
below, which illustrates it.

>
> > Actually, while DNSSEC is not named in the title/abstract, the examples
> > in the abstract are DNSSEC-specific, and the wording in the rest of the
> > document refers for the most part to "validating resolvers". Should this
> > be a "DNSSEC Error Reporting" draft? Or a "DNS Error Reporting" draft,
> > in which case the function of "validating" itself should be less
> > emphasized? While a validating resolver can report more types of errors
> > than a non-validating resolver, validation is not a requirement to be
> > able to report.
>
> Agreed, but I really don't see the problem as severe. Would it be
> sufficient to add more examples of non-DNSSEC errors?
>

Yes, I think a non-DNSSEC example would help, along with not using
"validating" outside the scope of DNSSEC-specific errors. As an example, in
the terminology, the reporting resolver is defined as a validating resolver:

> Reporting Resolver: In the context of this document, the term
> reporting resolver is used as a shorthand for a validating recursive
> resolver that supports DNS Error Reporting



>
> > On Tue, Nov 9, 2021 at 3:07 PM Roy Arends wrote:
> >
> > Dear WG,
> >
> > Change 3) There was a lot of descriptive text about what implementations
> > should and shouldn’t do, and what configurations should and
> > shouldn’t do. This was found to be overly descriptive and pedantic,
> > and has now been removed.
> >
> >
> > I see that the security consideration about not reporting errors from an
> > encrypted channel (over a supposedly unencrypted channel) has been
> > removed. Wouldn’t it make sense to leave it in, in order to avoid leaking
> > traffic for queries that were not previously visible on the network?
> > Possibly requiring that an encrypted channel (equal or stronger, for
> > whatever definition that may be) is used to send such r

Re: [DNSOP] Status of draft-ietf-dnsop-dns-error-reporting

2021-11-12 Thread Petr Špaček

On 12. 11. 21 7:42, Manu Bretelle wrote:
> Hi Roy,
>
> I like the idea of an out-of-band error reporting and therefore I like
> the proposition of this draft.
>
> One of the things I have a hard time visualizing though is how this
> could be used for more than reporting DNSSEC specific errors. With the
> option not being signed in the first place, it does not seem that DNSSEC
> is a requirement to be able to leverage this functionality, hence it
> would be great to think how we can make this work for more than
> DNSSEC-only errors.


E.g. it can conceivably report errors like "resolver had to fall back to
the Nth server because the first one we tried timed out". Is that a
sufficient example?



> As it is, the requirement for the EDNS0 option to be in the response,
> while it does offer some properties such as controlling the sampling
> rate, essentially prevents any report about answers that are not
> properly formatted in the first place, or that are never received: when
> a resolver is not able to reach any authoritative server for a given
> name, when a resolver starts falling back on stale data, and possibly,
> in the future, when it fails to reach a server over an advertised
> encrypted channel. There is likely value for an authoritative server
> operator in being able to get reports for those issues too.


While I agree with the sentiment that reporting other issues would
also be useful, I think that _for now_ we should keep the scope limited to
situations which do not require any extra state in resolvers.

That is, reporting "no server is reachable" requires prior information
stored or reachable somewhere else, which is IMHO an order of magnitude
more complex task. Let's get experience with simple error reporting
first and only then move forward to more complex tasks...



> The title of the draft, "DNS Error Reporting", would lead one to believe
> that it is a somewhat generic mechanism, but I don't think it is as it
> stands.


I disagree here. It is a generic mechanism, see the first response 
paragraph in this e-mail.


> Actually, while DNSSEC is not named in the title/abstract, the examples
> in the abstract are DNSSEC-specific, and the wording in the rest of the
> document refers for the most part to "validating resolvers". Should this
> be a "DNSSEC Error Reporting" draft? Or a "DNS Error Reporting" draft,
> in which case the function of "validating" itself should be less
> emphasized? While a validating resolver can report more types of errors
> than a non-validating resolver, validation is not a requirement to be
> able to report.


Agreed, but I really don't see the problem as severe. Would it be
sufficient to add more examples of non-DNSSEC errors?



> On Tue, Nov 9, 2021 at 3:07 PM Roy Arends wrote:
>
> > Dear WG,
> >
> > Change 3) There was a lot of descriptive text about what implementations
> > should and shouldn’t do, and what configurations should and
> > shouldn’t do. This was found to be overly descriptive and pedantic,
> > and has now been removed.
>
> I see that the security consideration about not reporting errors from an
> encrypted channel (over a supposedly unencrypted channel) has been
> removed. Wouldn’t it make sense to leave it in, in order to avoid leaking
> traffic for queries that were not previously visible on the network?
> Possibly requiring that an encrypted channel (equal or stronger, for
> whatever definition that may be) be used to send such reports if needed?
> This would also make sure the mechanism is going to work once the ADo*
> mechanisms are ironed out.


AFAIK it was removed because the only things we could place there were 
extremely vague and probably not implementable anyway.


Reason: There is _no such thing_ as a 1:1 mapping between client queries
and outgoing answers, which makes it super hard to define anything sensible.


A simple example:

1. Client A asks for login.secret.facebook.com over plain UDP (and is now
waiting for the resolver's answer).

2. Resolver starts recursing and eventually sends a query for
secret.facebook.com NS over UDP (the client sent its query over plain UDP,
right?). At this point the query has been sent but the answer has not been
received yet.

3. Client B asks for supersecretdomainnobodyshouldsee.secret.facebook.com
over TLS.

4. Resolver deduplicates the query for secret.facebook.com NS, i.e.
queries (1) and (3) are now waiting for the same packet: the delegation
from facebook.com to secret.facebook.com.

5. If this deduplicated query for secret.facebook.com NS fails and comes
back with the error reporting option, what should the resolver do now? We
have two clients waiting for it. Is the query considered "secret" or
not? If client B's query (the packet in step 3) had arrived a couple of ms
later, would it not be secret?
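
Steps 1 to 5 can be sketched as a toy model in Python. It assumes a
resolver that keys in-flight upstream queries by (name, type) and attaches
later clients to an existing entry instead of sending a second query.
`InFlightTable` and its methods are hypothetical names, not taken from any
resolver implementation.

```python
class InFlightTable:
    """Toy model of upstream query deduplication (hypothetical names)."""

    def __init__(self):
        # (qname, qtype) -> list of (client, transport) waiting for the answer
        self._waiters = {}

    def ask(self, client, transport, qname, qtype):
        """Register a waiting client.

        Returns True if a new upstream query has to be sent, False if the
        client was attached to a query that is already in flight.
        """
        key = (qname.lower(), qtype)
        is_new = key not in self._waiters
        self._waiters.setdefault(key, []).append((client, transport))
        return is_new

    def transports_waiting(self, qname, qtype):
        """Set of client transports that depend on this upstream query."""
        return {t for _, t in self._waiters.get((qname.lower(), qtype), [])}


table = InFlightTable()
# Steps 1-2: client A (plain UDP) triggers the upstream NS query.
sent_for_a = table.ask("A", "udp", "secret.facebook.com.", "NS")
# Steps 3-4: client B (TLS) is attached to the same in-flight query.
sent_for_b = table.ask("B", "tls", "secret.facebook.com.", "NS")
# Step 5: the failed query now belongs to both a plaintext and a TLS client.
mixed = table.transports_waiting("secret.facebook.com.", "NS")
```

After the two calls, `sent_for_a` is True, `sent_for_b` is False, and
`mixed` contains both "udp" and "tls", so there is no single transport to
attribute the failed upstream query to.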


In short: This way madness lies.

The only sane way to implement a "never leak queries to plaintext" policy
is to operate a TLS-only resolver and not permit non-TLS
clients/queries. Then you can disable the error reporting feature
completely ...



Having said that, we can have _some_ text in Security considerations 
section

Re: [DNSOP] Status of draft-ietf-dnsop-dns-error-reporting

2021-11-11 Thread Manu Bretelle
Hi Roy,

I like the idea of an out-of-band error reporting and therefore I like the
proposition of this draft.

One of the things I have a hard time visualizing though is how this could
be used for more than reporting DNSSEC specific errors. With the option not
being signed in the first place, it does not seem that DNSSEC is a
requirement to be able to leverage this functionality, hence it would be
great to think how we can make this work for more than DNSSEC-only errors.

As it is, the requirement for the EDNS0 option to be in the response, while
it does offer some properties such as controlling the sampling rate,
essentially prevents any report about answers that are not properly
formatted in the first place, or that are never received: when a resolver
is not able to reach any authoritative server for a given name, when a
resolver starts falling back on stale data, and possibly, in the future,
when it fails to reach a server over an advertised encrypted channel. There
is likely value for an authoritative server operator in being able to get
reports for those issues too.

The title of the draft, "DNS Error Reporting", would lead one to believe
that it is a somewhat generic mechanism, but I don't think it is as it
stands. Actually, while DNSSEC is not named in the title/abstract, the
examples in the abstract are DNSSEC-specific, and the wording in the rest
of the document refers for the most part to "validating resolvers". Should
this be a "DNSSEC Error Reporting" draft? Or a "DNS Error Reporting" draft,
in which case the function of "validating" itself should be less
emphasized? While a validating resolver can report more types of errors
than a non-validating resolver, validation is not a requirement to be able
to report.


On Tue, Nov 9, 2021 at 3:07 PM Roy Arends  wrote:

> Dear WG,
>
> Change 3) There was a lot of descriptive text about what implementations should
> and shouldn’t do, and what configurations should and shouldn’t do. This was
> found to be overly descriptive and pedantic, and has now been removed.
>

I see that the security consideration about not reporting errors from an
encrypted channel (over a supposedly unencrypted channel) has been removed.
Wouldn’t it make sense to leave it in, in order to avoid leaking traffic for
queries that were not previously visible on the network? Possibly requiring
that an encrypted channel (equal or stronger, for whatever definition that
may be) be used to send such reports if needed? This would also make sure
the mechanism is going to work once the ADo* mechanisms are ironed out.

Thanks,
Manu


>
> There was a request to put the markdown version of the document in GitHub.
> This has now been placed here:
> https://github.com/RoyArends/draft-ietf-dnsop-dns-error-reporting
>
> New version:
> https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-dns-error-reporting-01.txt
> Diffs:
> https://www.ietf.org/rfcdiff?url2=draft-ietf-dnsop-dns-error-reporting-01
>
> Warm regards,
>
> Roy Arends
>


Re: [DNSOP] Status of draft-ietf-dnsop-dns-error-reporting

2021-11-10 Thread Roy Arends

> On 10 Nov 2021, at 09:35, libor.peltan  wrote:
> 
> Hi Roy,
> 
>> Change 2) There was an observation by developers that some authoritative 
>> servers do not parse (unknown) EDNS0 options correctly, leading to an 
>> additional roundtrip by the resolver. It was suggested that authoritative 
>> servers could return the new EDNS0 option “unsolicited”. This is already the 
>> case for Extended DNS errors. We have adopted this suggestion. It was also 
>> pointed out that this kind of unsolicited behaviour can be surveyed. We 
>> believe that one such effort is underway.
> 
> Let me express my personal opinion here.

Thanks! I really appreciate feedback on this! Keep it coming!

> While sending unsolicited EDE seems fine to me as it's just a few bytes, the
> error-reporting address might typically be roughly 100 bytes long,

Why would that be 100 bytes long? An error-reporting domain should be kept 
rather short.

> so sending it with every response may lead to a perceptible increase in
> traffic, including an increase in TCP fallbacks.

Would it help to require the authoritative server to only add this option when 
there is space to do so?
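
As a rough illustration of what "when there is space" could mean, the
Python sketch below checks whether the option still fits within the UDP
payload size the requester advertised. It assumes the option carries the
agent domain in uncompressed wire format behind the usual 4-byte EDNS0
option header (option code plus length), and that the response already
contains an OPT record. The function names and the example agent domain are
hypothetical.

```python
def wire_length(domain: str) -> int:
    """Length in octets of a domain name in uncompressed wire format."""
    labels = [label for label in domain.rstrip(".").split(".") if label]
    # Each label costs one length octet plus its bytes; the root adds one.
    return sum(len(label.encode("ascii")) + 1 for label in labels) + 1


def option_size(agent_domain: str) -> int:
    """Size of the option: 2-byte code + 2-byte length + agent domain."""
    return 4 + wire_length(agent_domain)


def fits(response_len: int, agent_domain: str, udp_payload_size: int) -> bool:
    """True if the option can be appended without exceeding the limit."""
    return response_len + option_size(agent_domain) <= udp_payload_size


# Hypothetical short agent domain: 29 octets on the wire, 33 with header.
size = option_size("a01.reporting-agent.example.")
```

With a short agent domain like the one above the option costs 33 octets,
so a 1199-octet response still fits a 1232-octet limit while a 1200-octet
response does not.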

> This may be tolerable if there were some better reason for it. But I don't
> like arguing from broken implementations. Always dodging broken
> implementations only leads to more broken implementations (see DNS Flag Day
> etc.). In the ideal case, we should aim for a state where broken
> implementations fail constantly.

This is not that! If we were sending new EDNS0 options to authoritative
servers, it would lead to additional round-trips to dodge broken servers.
That is the way of “dodging broken implementations”. It won’t get these
implementations fixed, and this additional resolver code to route around
brokenness in the field will eventually end up at a flag day.

Consider the current method of returning unsolicited new options in responses:
a resolver may not handle unsolicited new EDNS0 options. Such resolvers will
either be fixed or not be used. This is not a negotiation, unless the resolver
falls back to sending a query without EDNS0. I have been told by developers
that there is more broken authoritative server software out there than broken
resolver software.
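
For reference, the Python sketch below shows how a receiver can step over
options it does not understand. The OPT RDATA layout (16-bit option code,
16-bit length, then data) is from RFC 6891, and code 15 for Extended DNS
Errors is from RFC 8914. The function names and the sample option code
65001 (from the local/experimental range) are illustrative assumptions.

```python
import struct

EDE_OPTION_CODE = 15  # Extended DNS Error (RFC 8914)


def parse_edns_options(rdata: bytes):
    """Split OPT RDATA into a list of (option_code, option_data) pairs."""
    options = []
    offset = 0
    while offset < len(rdata):
        if offset + 4 > len(rdata):
            raise ValueError("truncated option header")
        code, length = struct.unpack_from("!HH", rdata, offset)
        offset += 4
        if offset + length > len(rdata):
            raise ValueError("truncated option data")
        options.append((code, rdata[offset:offset + length]))
        offset += length
    return options


def understood_options(rdata: bytes, understood: set):
    """Keep only options the receiver knows; silently skip the rest."""
    return {code: data
            for code, data in parse_edns_options(rdata)
            if code in understood}


# An EDE option (info-code 7) followed by an unknown, unsolicited option.
sample = (struct.pack("!HH", EDE_OPTION_CODE, 2) + b"\x00\x07"
          + struct.pack("!HH", 65001, 3) + b"abc")
kept = understood_options(sample, {EDE_OPTION_CODE})
```

A receiver written this way carries on with the options it knows, and the
unsolicited one costs it nothing beyond the bytes on the wire.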

Field tests are taking place to measure impact.

Hope this helps!

Warmly,

Roy


Re: [DNSOP] Status of draft-ietf-dnsop-dns-error-reporting

2021-11-10 Thread libor.peltan

Hi Roy,

> Change 2) There was an observation by developers that some
> authoritative servers do not parse (unknown) EDNS0 options correctly,
> leading to an additional roundtrip by the resolver. It was suggested
> that authoritative servers could return the new EDNS0 option
> “unsolicited”. This is already the case for Extended DNS errors. We
> have adopted this suggestion. It was also pointed out that this kind
> of unsolicited behaviour can be surveyed. We believe that one such
> effort is underway.


Let me express my personal opinion here.

While sending unsolicited EDE seems fine to me as it's just a few bytes,
the error-reporting address might typically be roughly 100 bytes long, so
sending it with every response may lead to a perceptible increase in
traffic, including an increase in TCP fallbacks.


This may be tolerable if there were some better reason for it. But I
don't like arguing from broken implementations. Always dodging
broken implementations only leads to more broken implementations (see DNS
Flag Day etc.). In the ideal case, we should aim for a state where broken
implementations fail constantly.


Libor



[DNSOP] Status of draft-ietf-dnsop-dns-error-reporting

2021-11-09 Thread Roy Arends
Dear WG, 

After the October 26, IETF DNSOP interim WG on DNS Error Reporting, the 
document editors have made the following changes to reflect the discussion:

Change 1) Due to qname minimisation, the reporting agent may not know that the 
reported string has been shortened. There were a few options suggested, such as 
adding a label counter. However, the most straightforward option seemed to be 
to start the reporting query with an _er label as well.
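
As a sketch of why the leading label helps, the Python below builds a
report query name and checks for the leading `_er`. The layout
`_er.<qtype>.<qname>.<error code>._er.<agent domain>` is an assumption
based on the later published specification (RFC 9567); the draft version
discussed here may differ. The function names are hypothetical, and a
reported name that itself begins with `_er` is not handled.

```python
def _labels(name: str):
    """Split a presentation-format name into labels, ignoring the root."""
    return [label for label in name.rstrip(".").split(".") if label]


def build_report_qname(qname: str, qtype: int, ede_code: int,
                       agent_domain: str) -> str:
    """Assemble the report query name (assumed layout, see above)."""
    labels = (["_er", str(qtype)] + _labels(qname)
              + [str(ede_code), "_er"] + _labels(agent_domain))
    if any(len(label) > 63 for label in labels):
        raise ValueError("label longer than 63 octets")
    # Wire length: one length octet per label, plus the root octet.
    if sum(len(label) + 1 for label in labels) + 1 > 255:
        raise ValueError("report name longer than 255 octets")
    return ".".join(labels) + "."


def looks_complete(report_qname: str) -> bool:
    """With qname minimisation the agent may first see only the rightmost
    labels; only the full name starts with the leading _er label."""
    return report_qname.lower().startswith("_er.")


full = build_report_qname("broken.test.", 1, 7,
                          "a01.reporting-agent.example.")
```

Here `full` is `_er.1.broken.test.7._er.a01.reporting-agent.example.`, and a
minimised prefix such as `7._er.a01.reporting-agent.example.` fails the
`looks_complete` check.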

Change 2) There was an observation by developers that some authoritative 
servers do not parse (unknown) EDNS0 options correctly, leading to an 
additional roundtrip by the resolver. It was suggested that authoritative 
servers could return the new EDNS0 option “unsolicited”. This is already the 
case for Extended DNS errors. We have adopted this suggestion. It was also 
pointed out that this kind of unsolicited behaviour can be surveyed. We believe 
that one such effort is underway.

Change 3) There was a lot of descriptive text about what implementations should
and shouldn’t do, and what configurations should and shouldn’t do. This was
found to be overly descriptive and pedantic, and has now been removed.

There was a request to put the markdown version of the document in GitHub. This 
has now been placed here: 
https://github.com/RoyArends/draft-ietf-dnsop-dns-error-reporting 


New version:
https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-dns-error-reporting-01.txt

Diffs:
https://www.ietf.org/rfcdiff?url2=draft-ietf-dnsop-dns-error-reporting-01


Warm regards,

Roy Arends
