Re: [Uta] TLSRPT further comments

Viktor Dukhovni Thu, 07 Dec 2017 10:31:16 -0800


> On Dec 7, 2017, at 8:17 AM, Brotman, Alexander 
> <[email protected]> wrote:
> 
> 
>> Section 4.3.2.1:
>> 
>>      "dnssec-invalid": This would indicate that no valid records were
>>      returned from the recursive resolver.  The request returned with
>>      SERVFAIL for the requested TLSA record.  It should be noted that
>>      if the reporter's systems are having problems resolving
>>      destination DNS records due to DNSSEC failures, it's possible they
>>      will also be unable to resolve the TLSRPT record, therefore these
>>      types of reports may be rare.
>> 
>> Unfortunately, failure to resolve TLSA records is not yet always a result of 
>> a broader DNS outage.  A small fraction of systems have nameservers that are 
>> either behind firewalls which implement misguided filtering of queries by 
>> RRtype (blocking TLSA and other "unusual"
>> queries) or are buggy and mishandle authenticated denial of existence.
>> Therefore, the "may be rare" sentence should I think be deleted.
> 
> This seems to be a different type of operational issue.  If an admin can never 
> resolve a TLSA record from an external site, there may be larger issues going 
> on.  We can alter/remove the sentence, but I don't think it will change the 
> frequency of those types of reports.


I run a roughly daily survey of the now ~5.1 million DNSSEC-signed
domains that I've gathered from various sources.  Today, among these
5.1 million domains, 132 have the property that:

        0. They have a working MX host
        1. DNS lookups work fine for "typical" MX/A/AAAA/... queries.
        2. TLSA lookups consistently ServFail.

The reason is either a misguided firewall that filters the query stream
or incorrect handling of DNSSEC "denial of existence", causing some or
all queries for records that don't exist to fail.  The problem is even
more common (another ~400 domains) if we remove requirement 0 and
include parked or otherwise non-email domains, but these are of course
out of scope for the moment.
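
To make the three criteria concrete, here is a rough Python sketch.  The
function name and the shape of the "rcodes" dictionary are illustrative
assumptions, not the survey's actual code; the caller is assumed to have
already performed the lookups and recorded the consistent outcome per
RR type.

```python
SERVFAIL = "SERVFAIL"
TYPICAL_TYPES = ("MX", "A", "AAAA")


def matches_survey_criteria(has_working_mx, rcodes):
    """Return True when a domain shows the "TLSA-only ServFail" pattern.

    has_working_mx: criterion 0, decided elsewhere (hypothetical input).
    rcodes: maps an RR type name to the response code consistently seen
            for that type, e.g. {"MX": "NOERROR", "TLSA": "SERVFAIL"}.
    """
    # Criterion 1: "typical" queries do not fail (missing data counts as failure).
    typical_ok = all(rcodes.get(t, SERVFAIL) != SERVFAIL for t in TYPICAL_TYPES)
    # Criterion 2: TLSA lookups ServFail.
    tlsa_fails = rcodes.get("TLSA") == SERVFAIL
    return bool(has_working_mx and typical_ok and tlsa_fails)
```

A domain whose resolver path drops only the "unusual" query types would
satisfy this predicate even though its ordinary mail routing looks healthy.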

Also today 87 MX hosts have non-matching TLSA records, 4 don't offer
STARTTLS despite published TLSA records, 2 have certificates with
non-matching names, and 2 more are expired.  So in the "problem
landscape", DNSSEC failures that break primarily TLSA records, while
the DNS otherwise appears to be working, are not actually unusual.
One would expect that all TLS policy problems are comparatively
rare (and indeed at present they are), and the question is now
about frequency among problem destinations.

So in fact the qualifier that "these types of reports may be rare"
is unnecessary, speculative and does not match present reality.

>> We could perhaps add a "dane-required" failure mode for cases where DANE is 
>> mandatory (by mutual agreement perhaps), but then the receiving domain's 
>> TLSA records (really the records for the underlying MX hosts) are 
>> unexpectedly removed, or become "insecure" if DNSSEC is disabled (DS records 
>> removed from parent zone).
> 
> I'd be inclined to say that this would appear in a "v2" of this draft.

Drafts don't have a "v2".  I don't expect to see a "bis" revision
of the published RFC any time soon, since the WG will soon be closed.
There is still time to make some final changes.  I apologize for
the belated "tlsrpt" feedback; I was too busy with STS and other
matters to give any feedback on "tlsrpt", so here it is, "late",
but not "never".

>> I think this is a reasonable argument for camels over kebabs, any thoughts 
>> on whether it is possible to revise the field names accordingly?
> 
> The draft has been like this since the inception, and I can't say I'm 
> inclined to change it.  If there's strong group consensus, we can do it.

Please do consider this.  I've lately been doing some Haskell
programming, and it supports implicit generation of JSON encoders
from structure type definitions, freeing the developer from having
to write any code to get JSON output, but with "kebab-case" direct
"deriving" of a JSON encoder is no longer possible and one would
have to explicitly define some encoding rules.  I don't think that
Haskell is unique in allowing camelCase identifiers and not
"kebab-case" identifiers.

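The same trade-off can be sketched in Python (used here only for
illustration; the class and helper names are hypothetical).  With
camelCase keys the field names double as identifiers, so a generic
encoder works as-is; with kebab-case keys an explicit renaming table
is needed, because "result-type" is not a valid identifier.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FailureDetails:
    # camelCase names are valid identifiers, so no extra mapping is needed.
    resultType: str
    failedSessionCount: int


# Explicit encoding rules, required only for the kebab-case wire format.
KEBAB_NAMES = {
    "resultType": "result-type",
    "failedSessionCount": "failed-session-count",
}


def to_camel_json(details):
    """Encode directly from the structure definition."""
    return json.dumps(asdict(details))


def to_kebab_json(details):
    """Encode via the hand-written renaming table."""
    renamed = {KEBAB_NAMES[key]: value for key, value in asdict(details).items()}
    return json.dumps(renamed)
```

The renaming table is small here, but it has to be kept in sync with
the structure by hand for every field in the report.
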
>> Section 4.4:
>> 
>> I notice that:
>> 
>>     "failure-details": [
>>       {
>>         "result-type": result-type,
>>         "sending-mta-ip": ip-address,
>>         "receiving-mx-hostname": receiving-mx-hostname,
>>         "receiving-mx-helo": receiving-mx-helo,
>>         "failed-session-count": failed-session-count,
>>         "additional-information": additional-info-uri,
>>         "failure-reason-code": failure-reason-code
>>         }
>>       ]
>> 
>> includes only the "receiving-mx-hostname" and not its IP address.  Some 
>> sites have multiple hosts behind a single name, e.g. sometimes the IPv4 
>> address for mail.example.com lands on a different machine than the IPv6 
>> address for the same name (and indeed they present different certificate 
>> chains, ...).  Leaving out the address might make it more difficult to 
>> identify the problem receiving machine.  Perhaps the intention here is that 
>> receiving systems should make an effort to assign different 
>> "receiving-mx-helo" names in such cases, to disambiguate them?  If so, 
>> perhaps that should be mentioned in the text.
> 
> We can add this, though, in the case of a front-end VIP, you're going to have 
> the same target IP in many stanzas.  I understand the rationale, just wonder 
> how useful it will be in the end.

Yes, the IP will not always be sufficient to disambiguate
all problems, but load-balanced MX-pools are more typical
of large providers that presumably have more operational
discipline and one hopes will rarely if ever mess up. On
the other hand, smaller SOHO domains do make mistakes from
time to time, and it is among these that I in fact sometimes
see TLS working for (say) IPv4 and not (say) IPv6 of the
"same" MX host, which turns out to be the "same" in name
only.
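
A small sketch of why the address helps.  The "receiving-ip" key below
is a hypothetical field name (the draft text quoted above has no such
field), and the addresses are documentation examples.

```python
from collections import defaultdict


def failures_by_host_and_ip(failure_details):
    """Sum failed sessions per (MX hostname, receiving IP) pair.

    "receiving-ip" is an assumed, illustrative field; entries without it
    are grouped under "unknown".
    """
    totals = defaultdict(int)
    for entry in failure_details:
        key = (entry["receiving-mx-hostname"], entry.get("receiving-ip", "unknown"))
        totals[key] += entry["failed-session-count"]
    return dict(totals)


example = [
    {"receiving-mx-hostname": "mail.example.com",
     "receiving-ip": "2001:db8::25", "failed-session-count": 7},
    {"receiving-mx-hostname": "mail.example.com",
     "receiving-ip": "2001:db8::25", "failed-session-count": 3},
]
```

Grouped by hostname alone, the failures would be attributed to
"mail.example.com" as a whole; with the address they point at the
IPv6 machine only.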


>> The format for DANE "policy-string" values is both under-specified, and 
>> needlessly complex:
>> 
>>      "policy-string": A string representation of the policy, whether
>>      TLSA record ([RFC6698] section 2.3) or MTA-STS policy.  Examples:
>>      TLSA: ""_25._tcp.mx.example.com.  IN TLSA ( 3 0 1 \
>>      1F850A337E6DB9C609C522D136A475638CC43E1ED424F8EEC8513D7 47D1D085D
>>      )""...
>> 
>> It should be more clear whether the "TLSA:" prefix is part of the string 
>> (I believe it is not), and what the pair of consecutive double quotes is 
>> about, and how multiple TLSA records are represented.  The JSON policy 
>> includes "\n" separators (which JSON maps to actual newlines); are TLSA 
>> records to be separated with similar logical newlines?
>> 
>> More importantly, the form:
>> 
>>  _25._tcp.mx.example.com. IN TLSA ( 3 0 1 
>> 1F850A337E6DB9C609C522D136A475638CC43E1ED424F8EEC8513D7 47D1D085D)
>> 
>> carries much unnecessary syntax.  It should instead be a simple quintuple 
>> (qname, usage, selector, mtype, data):
>> 
>>   _25._tcp.mx.example.com. 3 0 1 
>> 1F850A337E6DB9C609C522D136A475638CC43E1ED424F8EEC8513D747D1D085D
>> 
>> with the implied "IN TLSA" and "(" ")" removed, and the hex data presented 
>> without any internal whitespace.  With that simplified it remains only to 
>> specify clearly how to present a complete TLSA RRset.
> 
> The intent was to have the record and the result.  Your changes are 
> acceptable, and we can make the alterations.    In the case of multiple 
> record results, how would you like that to be displayed?  Separated with a 
> semicolon is sufficient I would think.  We'll firm up the examples.

The "(" and ")" are entirely optional; the TLSA record is well-formed
without them.  Since the policy type is TLSA, why repeat "IN TLSA"?
It is redundant.  Optional internal white-space in the data payload
needlessly and significantly complicates parsing.  As for separators,
I would much prefer newline ("\n" just like in the STS policy blob).

This portion of the spec really should be more formal.  It presently
just presents a free-form example, rather than a well-defined syntax.
What's the intent of the consecutive double-quote characters, for
example?
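
As a sketch of how simple the proposed form is to produce and consume
(one quintuple per line, newline-separated, hex without internal
whitespace), assuming hypothetical helper names:

```python
def format_tlsa_rrset(records):
    """Render (qname, usage, selector, mtype, data) tuples, one per line."""
    lines = []
    for qname, usage, selector, mtype, data in records:
        hexdata = "".join(data.split()).upper()  # drop any internal whitespace
        bytes.fromhex(hexdata)                   # raises ValueError if not hex
        lines.append(f"{qname} {usage} {selector} {mtype} {hexdata}")
    return "\n".join(lines)


def parse_tlsa_rrset(policy_string):
    """Inverse of format_tlsa_rrset: every line has exactly five fields."""
    records = []
    for line in policy_string.split("\n"):
        qname, usage, selector, mtype, data = line.split()
        records.append((qname, int(usage), int(selector), int(mtype), data))
    return records
```

With no parentheses, no "IN TLSA" and no whitespace inside the data,
the parser is a plain split on whitespace.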


>> Another nit:
>> 
>>   o  "domain": The Policy Domain is the domain against which the MTA-
>>      STS or DANE policy is defined.  In the case of Internationalized
>>      Domain Names ([RFC5891]), the domain is the Punycode-encoded
>>      A-label ([RFC3492]) and not the U-label.
>> 
>> A domain name is not a "label" it consists of labels separated by "."
>> characters.
> 
> I believe you're asking to replace "label" with "labels", and that's fine.

Yes, basically, though the change requires a few more words to make the
grammar work.
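
To illustrate the distinction, a short Python sketch.  It assumes
Python's built-in "idna" codec, which implements the older IDNA 2003
rules rather than [RFC5891], so treat it as an approximation; the
function name is hypothetical.

```python
def to_a_label_form(domain):
    """Convert each label of a domain name to its ASCII (A-label) form.

    Assumption: uses the standard-library "idna" codec (IDNA 2003).
    """
    return domain.encode("idna").decode("ascii")


name = to_a_label_form("b\u00fccher.example")
labels = name.split(".")  # a domain name is a sequence of labels
```

Only the non-ASCII label is rewritten into an "xn--" A-label; the
domain as a whole is a dot-separated sequence of such labels.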

>> Also how are "policy-domain" and "mx-host" to be specified in the DANE case? 
>>  DANE policy is per-MX, not per-domain, and mx-host seems redundant in both 
>> cases, since with STS the policy string surely contains the same data.  What 
>> goes in these fields when no policy is obtained, they don't appear to be 
>> optional...
> 
> Perhaps I'm misinterpreting the statement. The policy domain is the recipient 
> domain, while the mx-host is the MX that is attempted.

That's not what the draft text says.  Instead it has:

  o  "mx-host-pattern": The pattern of MX hostnames from the applied
      policy.  It is provided as a string, and is interpreted in the
      same manner as the "Checking of Wildcard Certificates" rules in
      Section 6.4.3 of [RFC6125].  In the case of Internationalized
      Domain Names ([RFC5891]), the domain is the Punycode-encoded
      A-label ([RFC3492]) and not the U-label.

So it sure seems to be the "mx" element of the policy, repeated for
some reason.

>  Those do not need to align, and multiple recipient domains can point to a 
> single MX.  I'm not quite sure which policy you're referring to in the last 
> portion of that paragraph.

The "policy-string" in the JSON report in Section 4 already contains
the same "mx-host-pattern" (called just "mx").


>> Why in Section 5.1 is there a "report filename"?  Surely the processing
>> and storage of reports is entirely up to the receiving system, which
>> should ignore filename hints from the sender.
> 
> This is the filename upon delivery. The receiver can rename the file to
> anything they'd like.

A JSON report is NOT a file.  It makes no sense to have to specify
a filename and just creates security issues when one is processed
by poorly thought out receiving software.  If the intended purpose
of the filename is to carry useful metadata (rather than suggest
a storage location), such metadata should be encoded via additional
MIME attributes of the media type.

>> Why is the Content-Encoding (gzip or not) specified via a filename and not 
>> the standard header used for this purpose?
> 
> It's specified as both the attached filename and in the Content-Type header.

I see this as a mistake.  There is only one media type involved
here, which may have a Content-Encoding.  A receiving system may
always decompress any compressed form as it comes in (or just
use uncompressed inputs as-is), parse the JSON and load the data
into a database.  Another receiving system may always compress
any uncompressed inputs and save the reports into a file store
for archival or onward delivery.  This is up to the receiving
system, but either way I see only the one media type here.

The issue is perhaps that while "HTTP" has "Content-Encoding",
MIME as defined in RFC-2045 does not:

    https://www.ucolick.org/~sla/fits/mime/inetstds.html

       MIME media types and the WWW
       ...
       Section 14.11 gives the proper means of describing the HTTP
       transmission of a file which is already compressed

       The Content-Encoding entity-header field is used as a modifier
       to the media-type. When present, its value indicates what
       additional content codings have been applied to the entity-body,
       and thus what decoding mechanisms must be applied in order to
       obtain the media-type referenced by the Content-Type header field.
       Content-Encoding is primarily used to allow a document to be
       compressed without losing the identity of its underlying media
       type. Unfortunately it seems reasonably clear that Content-Encoding
       cannot be applied as if it were a MIME "Optional parameter" with the
       expectation that it could be used when transferring compressed files
       via e-mail. This observation is based upon section 19.4.4. It appears
       that a complete description of compressed files transmitted via e-mail
       requires a new RFC enhancing the MIME standard.
    
    https://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.4.4

        19.4.4 Introduction of Content-Encoding

        RFC 2045 does not include any concept equivalent to HTTP/1.1's
        Content-Encoding header field. Since this acts as a modifier on
        the media type, proxies and gateways from HTTP to MIME-compliant
        protocols MUST either change the value of the Content-Type header
        field or decode the entity-body before forwarding the message.
        (Some experimental applications of Content-Type for Internet mail
        have used a media-type parameter of ";conversions=<content-coding>"
        to perform a function equivalent to Content-Encoding. However, this
        parameter is not part of RFC 2045.)

Thus, at the very least for HTTP, there should be just one media type,
and Content-Encoding should be used to signal compression.  For email,
since we're introducing a new "multipart/report" subtype, and "Content-Encoding"
is not a standard MIME header, an appropriate media-type parameter (what
I earlier referred to as MIME "attribute") should be defined for
"application/tlsrpt".  It could be the above "; conversions=gzip" or
something similar.  The "application/tlsrpt" being a new media type,
you're free to define appropriate additional metadata.
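
A minimal sketch of what that could look like with Python's standard
email package.  The "conversions" parameter name is the one floated
above, not something the draft defines, and the helper names are
hypothetical.

```python
import gzip
import json
from email.message import EmailMessage


def build_report_part(report):
    """Build a MIME part carrying a gzip-compressed JSON report.

    Compression is signalled by a media-type parameter, not a filename.
    """
    part = EmailMessage()
    body = gzip.compress(json.dumps(report).encode("utf-8"))
    part.set_content(body, maintype="application", subtype="tlsrpt",
                     params={"conversions": "gzip"})
    return part


def read_report_part(part):
    """Decode the part, undoing gzip only when the parameter says so."""
    data = part.get_content()  # bytes, transfer encoding already removed
    if part.get_param("conversions") == "gzip":
        data = gzip.decompress(data)
    return json.loads(data)
```

The receiver sees a single media type either way and never needs to
look at a filename.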

>> In section 5.3 what are these "%s" prefixes in front of literal quoted 
>> strings?  Is this a standard syntax?
>> 
>>   The [RFC5322].Subject field for report submissions SHOULD conform to
>>   the following ABNF:
>> 
>>       tlsrpt-subject = %s"Report" FWS               ; "Report"
>>                        %s"Domain:" FWS              ; "Domain:"
>>                        domain-name FWS              ; per [RFC6376]
>>                        %s"Submitter:" FWS           ; "Submitter:"
>>                        domain-name FWS              ; per [RFC6376]
>>                        %s"Report-ID:" FWS           ; "Report-ID:"
>>                        "<" id-left "@" id-right ">" ; per [RFC5322]
>>                        [CFWS]                       ; per [RFC5322]
>>                                                     ; (as with FWS)
> 
> Yes, I believe Chris asked for that change.  I'd have to check list history.

A reference to a document that defines such syntax would
be helpful.  It just looked confusing to me...
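
For what it's worth, the %s"..." prefix looks like the case-sensitive
string notation from RFC 7405.  Read that way, the grammar can be
approximated by a case-sensitive regular expression.  The DOMAIN and
MSG_ID patterns below are deliberately simplified stand-ins for
domain-name and "<" id-left "@" id-right ">", not faithful
translations of the referenced RFCs.

```python
import re

# Simplified assumptions, not the real RFC 6376 / RFC 5322 productions.
DOMAIN = r"[A-Za-z0-9](?:[A-Za-z0-9.-]*[A-Za-z0-9])?"
MSG_ID = r"<[^<>@\s]+@[^<>@\s]+>"

# No re.IGNORECASE: the literal keywords must match exactly as written.
SUBJECT_RE = re.compile(
    rf"Report\s+Domain:\s+({DOMAIN})\s+"
    rf"Submitter:\s+({DOMAIN})\s+"
    rf"Report-ID:\s+({MSG_ID})\s*"
)


def parse_subject(subject):
    """Return (domain, submitter, report_id), or None if no match."""
    match = SUBJECT_RE.fullmatch(subject)
    return match.groups() if match else None
```

A subject that differs only in the case of "Report" or "Domain:" is
rejected, which is the behaviour the "%s" prefix asks for.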


>> In section 5.4, why a separate GZIP-specific media type and not just a 
>> Content-Encoding?  Surely the media type is just the JSON report format, and 
>> GZIP is an encoding layer...
> 
> I believe this was at the request of the WG Chairs.  I'd have to go back 
> through history to verify this.

Please do.  Keep in mind that absent a "Content-Encoding" header, one
should indeed specify that the content is compressed by means other
than a filename (which I argue should not be present at all).  However,
the solution as I see it is to employ the "Content-Encoding" header
already defined for this purpose (with HTTP, and a new media-type
parameter for MIME email) and avoid a separate media type for
compressed JSON reports.

>> I can't make head nor tail of the verbiage in 5.6.  Sounds like someone ran 
>> over a thesaurus while riding a lawn mower... :-)
> 
> If the filename/subject disagree with the report, then the report is the 
> authoritative source.

That's much simpler and clearer than the current text.  Of
course no "filename" should be used to convey *any* structured
information about the report, indeed a filename should be
entirely optional (unspecified in this draft).

Like the optional filename, the subject should be for human
consumption only.  A suggested subject does make it easier
for reports from multiple sources to stand out as such at a
glance, when the "rua" email address is for a human postmaster
and lands in a general-purpose (personal?) mailbox.

Solely as a *convenience* to a human postmaster, the only
structured data that might be appropriate to include as
part of an optional filename would be a ".gz" or a ".txt.gz"
suffix to make it easier for postmasters to save reports
that land in their mailbox.  The rest of any filename
should be uninterpreted opaque data, and it could simply
be "tlsrpt.txt" or "tlsrpt.gz", but again for convenience
it *may* be friendlier to include a date:

        tlsrpt-20171206.txt
        tlsrpt-20171206.txt.gz

Whether or where the recipient saves such reports in some
filesystem is entirely at the recipient's discretion.
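
A sender-side sketch of such a purely cosmetic name (the helper name
is hypothetical, and nothing here is meant to be parsed by the
receiver):

```python
from datetime import date


def suggested_filename(day, compressed):
    """Build an optional, human-friendly filename hint for a report."""
    name = f"tlsrpt-{day:%Y%m%d}.txt"
    # The suffix is the only structured part, and only as a convenience.
    return name + ".gz" if compressed else name
```

The two names shown above correspond to suggested_filename(date(2017,
12, 6), False) and suggested_filename(date(2017, 12, 6), True).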

-- 
        Viktor.

_______________________________________________
Uta mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/uta
