John C Klensin <[EMAIL PROTECTED]> wrote:

> Instead, what is needed is one very clear paragraph in the IDNA
> document (I think). That paragraph should say that IDNA is to be used
> (with nameprep, etc.) for the representation of non-ASCII domain names
> in labels associated with RRs of type <list 1> in Class=IN, that it
> MUST NOT (?) be used with RRs of type <list 2> in Class=IN, and that
> it SHOULD NOT be used with RRs of type <list 3> in Class=IN until and
> unless a standards-track specification is produced that specifies
> otherwise. It should say similar things about data fields (for MX,
> NS, CNAME, etc (?)) using similar lists.
But IDNA has to work with more than just DNS; it has to work with the wide variety of other protocols that carry domain names. By logical extension, your proposed paragraph needs to list every field/argument of every protocol/interface where IDNA may/should-not/must-not be used. Isn't that too much trouble (or even impossible)? Isn't it simpler to design IDNA so that it can safely be used for any (textual) domain label anywhere? That was our intention.

"Eric A. Hall" <[EMAIL PROTECTED]> wrote:

> Resolvers, middle-boxes and replication masters all need to be able
> to convert between EDNS and ACE as part of the fallback process.
> Distributing the profile-specific prefix to every point where
> conversion might occur is a massive problem.

I was suggesting that conversion between ASCII and non-ASCII never be done inside the infrastructure except possibly when it uses the well-known standard profile; for application-specific profiles, I was suggesting that conversion be done only at the edges. This model avoids profile-agnostic conversion; only entities that know the proper profile perform the conversion, which simplifies the security analysis.

Your model is based on profile-agnostic conversion happening inside the infrastructure. Let's examine how that would benefit applications. Applications interact with the infrastructure in basically two ways: sending strings into the infrastructure, and receiving strings from the infrastructure.

When sending strings that use an application-specific profile into the infrastructure, the application must perform Stringprep itself, because the infrastructure needs to compare the string but doesn't know the profile. Performing Punycode and prepending the prefix is not any extra effort for the application programmer; whether the program calls Stringprep(profile) or ToASCII(profile,prefix), it's one function call either way. So there's no benefit in this case.
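As a rough illustration of the "one function call either way" point, here is a minimal Python sketch. It assumes Python's standard `encodings.idna` module, which implements only the standard Nameprep profile and the fixed `xn--` prefix; the `profile` and `prefix` arguments mentioned above are not part of that module and are simply left out here.

```python
from encodings import idna

label = "Bücher"

# Option 1: prepare only (Stringprep with the Nameprep profile).
prepared = idna.nameprep(label)

# Option 2: prepare, Punycode-encode, and prepend the ACE prefix.
ace = idna.ToASCII(label)

print(prepared)  # bücher
print(ace)       # b'xn--bcher-kva'
```

Either way the caller makes a single call; the extra encoding work happens inside the library.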
Now let's consider applications receiving strings that use an application-specific profile. If the infrastructure cannot do profile-agnostic conversion, then the application might receive an ACE, whereas if the infrastructure can do profile-agnostic conversion, then it can ensure that the application never receives an ACE. Whether this is a benefit depends on what the application does with the string. If it compares the string, then it needs to call Stringprep at least, and calling ToASCII is no more trouble, so there's no benefit. If the application passes the string along to a non-human, ACEs are not a problem, so there's no benefit. If the application displays the string to a user, then the ACE will need decoding, whereas a non-ACE wouldn't. That's the one case I can think of where applications could benefit from profile-agnostic conversion inside the infrastructure.

Now let's consider the cost of your model. Profile-agnostic conversions and comparisons can return wrong answers if the inputs are not prepared using the proper profile (whether by accident or by malice). There is nothing in the label itself to indicate the proper profile; you want to use the same prefix regardless of which profile is needed. The entities that depend on correct conversions and comparisons need to know the proper profile, and you are assuming that will be implied by context, like the DNS RR type. But IDNA is useless if it only works for DNS; it also needs to work for mail headers and SMTP commands and URIs and SSL certificates and so on. So before IDNA could be used securely in a given protocol/interface/etc, one would need to wait for the proper profile to be specified for that particular protocol/interface/etc. I'm sure that would be fine with you, Eric, but it would defeat one of the main design goals of IDNA, which is to allow applications to start using it without waiting for standards to be updated.
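The risk that comparisons go wrong when inputs have not been prepared can be sketched in Python. This is only an illustration under stated assumptions: `raw_ace` and `dns_equal` are hypothetical helpers invented for this example, standing in for a profile-agnostic encoder and for the ASCII case-insensitive matching that DNS applies to labels. The standard `encodings.idna` module supplies the prepared path (Nameprep inside `ToASCII`).

```python
from encodings import idna


def raw_ace(label: str) -> bytes:
    """Hypothetical profile-agnostic encoder: Punycode plus prefix, no Stringprep."""
    return b"xn--" + label.encode("punycode")


def dns_equal(x: bytes, y: bytes) -> bool:
    """ASCII case-insensitive comparison, modelling how DNS matches labels."""
    return x.lower() == y.lower()


upper, lower = "Über", "über"

# Without preparation, the two spellings produce different ACE strings,
# and ASCII case-insensitivity does not rescue the comparison.
unprepared_match = dns_equal(raw_ace(upper), raw_ace(lower))

# With Nameprep applied first (inside ToASCII), both converge on one ACE.
prepared_match = dns_equal(idna.ToASCII(upper), idna.ToASCII(lower))

print(unprepared_match, prepared_match)  # False True
```

The non-ASCII capital is the important part: case differences in ASCII characters survive Punycode as ASCII case and would still match in DNS, but a difference in a non-ASCII character changes the encoded digits themselves.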
I think it's simpler to have a single conversion and comparison rule for all domain labels everywhere. In the few cases where applications need to coerce other data types into domain names while preserving mixed-case or non-normalized strings, they can define their own mapping function (which does not use the IDNA prefix), and they'll just have to do the conversion themselves with no help from the infrastructure.

> we are left with the application always performing conversion.
> That design blows up the architectural benefits from having the
> middle-boxes do it (in particular, having the caches learn the data so
> they can cache it).

I don't understand this. If the conversion to/from ACE is done only in applications, then the strings on the wire are all ASCII, and caches handle them just as they always have.

> <fooprep> FOO <barprep>
>
> Nobody has yet told me why this won't work.

I'm very leery of allowing <fooprep> on the left. If <fooprep> does not include case-folding, then two queries for names that differ only in upper/lower case could return different data. That's a fundamental departure from the current model. As for <barprep> on the right, the same argument could be made for reverse queries, though I admit that reverse queries are hardly ever used.

My main concerns were given above. Would it work? Maybe it would. But is it a good idea? It still seems to me that allowing multiple profiles for domain labels is more complication than it's worth.

> AMC and paf seem to be intimating that the IDNA labels may not be
> reversible, which may require an additional fixup if so, and if it is
> possible.

Punycode is reversible. It can encode any sequence of integers, and the input sequence can always be recovered exactly. Nameprep is deliberately non-reversible. It does normalization and case-folding. ToASCII is non-reversible only because it calls Nameprep. If you were to define Eric's-ToASCII, which does not call Nameprep, then it would be reversible.

AMC
