On Thu, Dec 11, 2014 at 02:51:27PM -0500, Rose, Scott W. wrote:
> Realized the other action item I was assigned to from the interim
> meeting was email canonicalization for SMIMEA. I believe it stems
> from Viktor Dukhovni's email to the endymail list:
> http://www.ietf.org/mail-archive/web/endymail/current/msg00134.html
>
> I was wondering if we can borrow a page from RFC 4034 Section 6.2 and include
> text in the draft Section 3, item 1 in the numbered list:
>
> 1. The user name (the "left-hand side" of the email address, called
> the "local-part" in the mail message format definition [RFC2822]
> and the "local part" in the specification for internationalized
> email [RFC6530]), is hashed using the SHA2-224 [RFC5754]
> algorithm (with the hash being represented in its hexadecimal
> representation), to become the left-most label in the prepared
> domain name. This does not include the "@" character that
> separates the left and right sides of the email address. The
> string that is used for the local part is a Unicode string
> encoded in UTF-8 **with all upper case letters converted to their
> corresponding lower case letters where appropriate.**
>
> The text between the '**' is new. The goal is to prevent a situation when
> the email address is "[email protected]" and the SMIMEA is created using
> "jrandom" as the user name. Would this be enough, or are there scripts
> where this would result in different or potentially conflicting owner names?

This proposal is sadly simply wrong.  There is no correct
(language-independent) canonicalization of Unicode to lower case.
Nor is it appropriate to down-case even ASCII localparts, because
these are by definition case-sensitive on the wire, with any
case-folding solely at the discretion of the destination system.

I have a proposal that solves the ASCII use-case.  Sadly, little
can be done for non-ASCII Unicode; those names will just have to
be used consistently by all parties.

For all-ASCII addresses (ignoring for the moment Turkish
case-folding of "I" to a non-ASCII "dotless" "i"), the proposal is
as follows:

  * Clarification: Localparts that are not dot-atoms and
    require quoting retain the quotes when hashed.  Only
    the @domain part of the address is removed; the rest
    of the address is retained verbatim.

  * Domains that publish user SMIMEA records and intend
    for the names to be treated case-insensitively compute two
    hashes for each name:

        SHA2-224("Frank.Jr.") -> <base32-hash1>
        SHA2-224(@lower:"frank.jr.") -> <base32-hash2>

    The DNS records are then:

        <base32-hash1>.example.com. IN SMIMEA ...
        <base32-hash2>.example.com. IN CNAME <base32-hash1>

  * Domains that don't do case-insensitive delivery publish only
    the as-is form of each address without any "@lower:" prefix.

  * Clients that encounter an ASCII localpart that is not all lower-case
    try both keys, first the localpart as-is, then case-folded with
    the "@lower:" prefix.
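
To make the above concrete, here is a rough Python sketch of how a
publisher and a client might derive the labels.  It rests on several
assumptions that are not settled in this message: the hash input is
the UTF-8 bytes of the localpart, the "@lower:" form hashes the literal
string "@lower:" followed by the ASCII-lower-cased localpart, and the
base32 output is lower-cased with padding stripped.  The function names
are purely illustrative.

```python
import base64
import hashlib
from typing import List

# Prefix from the proposal; how it is concatenated and encoded is an assumption.
LOWER_PREFIX = "@lower:"


def label(data: str) -> str:
    """SHA2-224 over the UTF-8 bytes, base32, lower-cased, no padding (assumed)."""
    digest = hashlib.sha224(data.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


def ascii_lower(localpart: str) -> str:
    """Fold only A-Z to a-z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in localpart)


def publisher_labels(localpart: str, case_insensitive: bool) -> List[str]:
    """Labels a domain would publish: as-is first, then the optional alias."""
    labels = [label(localpart)]  # owner of the SMIMEA record
    if case_insensitive:
        # owner of the CNAME pointing at the as-is label
        labels.append(label(LOWER_PREFIX + ascii_lower(localpart)))
    return labels


def client_lookup_labels(localpart: str) -> List[str]:
    """Labels a client would try, in order."""
    labels = [label(localpart)]
    if localpart.isascii() and localpart != ascii_lower(localpart):
        labels.append(label(LOWER_PREFIX + ascii_lower(localpart)))
    return labels
```

With these assumptions, a client holding "FRANK.JR." misses on the
as-is label and then hits the same "@lower:" label that the domain
published as a CNAME for "Frank.Jr.".  A client whose localpart is
already all lower-case only ever tries the as-is label.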
--
Viktor.
_______________________________________________
dane mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/dane