Re: [dane] email canonicalization for SMIMEA owner names

Viktor Dukhovni Thu, 11 Dec 2014 12:51:31 -0800

On Thu, Dec 11, 2014 at 02:51:27PM -0500, Rose, Scott W. wrote:

> Realized the other action item I was assigned to from the interim
> meeting was email canonicalization for SMIMEA.  I believe it stems
> from Viktor Dukhovni's email to the endymail list:
> http://www.ietf.org/mail-archive/web/endymail/current/msg00134.html
> 
> I was wondering if we can borrow a page from RFC 4034 Section 6.2 and include 
> text in the draft Section 3, item 1 in the numbered list:
> 
>      1.   The user name (the "left-hand side" of the email address, called
>        the "local-part" in the mail message format definition [RFC2822]
>        and the "local part" in the specification for internationalized
>        email [RFC6530]), is hashed using the SHA2-224 [RFC5754]
>        algorithm (with the hash being represented in its hexadecimal
>        representation, to become the left-most label in the prepared
>        domain name.  This does not include the "@" character that
>        separates the left and right sides of the email address.  The
>        string that is used for the local part is a Unicode string
>        encoded in UTF-8 **with all upper case letters converted to their
>        corresponding lower case letters where appropriate.**
> 
> The text between the '**' is new.  The goal is to prevent a situation when 
> the email address is "[email protected]" and the SMIMEA is created using 
> "jrandom" as the user name.   Would this be enough, or are there scripts 
> where this would result in different or potentially conflicting owner names?


This proposal is sadly simply wrong.  There is no correct
(language-independent) canonicalization of Unicode to lower case.

Nor is it appropriate to down-case even ASCII localparts, because
these are by definition case-sensitive on the wire, with any
case-folding solely at the discretion of the destination system.

I have a proposal that solves the ASCII use-case.  Sadly, little
can be done for non-ASCII Unicode; those names will just have to
be used consistently by all parties.

For all-ASCII addresses, (ignoring for the moment Turkish case-
folding of "I" to a non-ASCII "dotless" "i"), the proposal is
as follows:

    * Clarification: Localparts that are not dot-atoms and
      require quoting retain the quotes when hashed.  Only
      the @domain part of the address is removed; the rest
      of the address is retained verbatim.

    * Domains that publish user SMIMEA records and intend for
      the names to be treated case-insensitively compute two
      hashes for each name:

            SHA2-224("Frank.Jr.")        -> <base32-hash1>
            SHA2-224(@lower:"frank.jr.") -> <base32-hash2>

      The DNS records are then: 

        <base32-hash1>.example.com. IN SMIMEA ...
        <base32-hash2>.example.com. IN CNAME <base32-hash1>

    * Domains that don't do case-insensitive delivery publish only
      the as-is form of each address without any "@lower:" prefix.

    * Clients that encounter an ASCII localpart that is not all lower-case
      try both keys: first the localpart as-is, then case-folded with
      the "@lower:" prefix.
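
As a rough illustration of the steps above, here is a minimal Python
sketch.  Several details are assumptions of the sketch, not part of the
proposal text: that the "@lower:" prefix is hashed together with the
folded localpart as one string, that the base32 form is unpadded and
lower-cased, and all of the function names, which are hypothetical.

```python
import base64
import hashlib
import string

# Assumption: the "@lower:" prefix is part of the string that gets hashed.
LOWER_PREFIX = "@lower:"

# Fold only ASCII A-Z, so no locale or Unicode case rules are involved.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def localpart_of(address: str) -> str:
    """Drop only the trailing @domain; quotes in the localpart are kept."""
    return address.rsplit("@", 1)[0]


def label(text: str) -> str:
    """SHA2-224 of the UTF-8 bytes, as unpadded lower-case base32 (assumed form)."""
    digest = hashlib.sha224(text.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def publisher_labels(localpart: str, case_insensitive: bool):
    """Return (as-is label, alias label or None) for a publishing domain.

    The as-is label would own the SMIMEA record; the alias label, if any,
    would own a CNAME pointing at the as-is label.
    """
    as_is = label(localpart)
    if not case_insensitive or not localpart.isascii():
        return as_is, None
    return as_is, label(LOWER_PREFIX + localpart.translate(_ASCII_LOWER))


def client_candidates(localpart: str):
    """Labels a client tries in order: as-is first, then the folded form."""
    candidates = [label(localpart)]
    folded = localpart.translate(_ASCII_LOWER)
    if localpart.isascii() and folded != localpart:
        candidates.append(label(LOWER_PREFIX + folded))
    return candidates
```

With these helpers, client_candidates(localpart_of("Frank.Jr.@example.com"))
yields two labels, while an all-lower-case or non-ASCII localpart yields
only the as-is label.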
      
-- 
        Viktor.

_______________________________________________
dane mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dane
