--On Tuesday, 11 June, 2002 23:04 +0000 "Adam M. Costello" <[EMAIL PROTECTED]> wrote:
> John C Klensin <[EMAIL PROTECTED]> wrote:
>
>> (i) The specification now appears to say that applications can
>> decide to use IDNA or not. Presumably, they can decide to use
>> something else instead.
>
> Paul Hoffman / IMC <[EMAIL PROTECTED]> replied:
>
>> If either sentence is at all true, we need to fix it. I
>> don't see any place that says that an application doesn't
>> need to use IDNA for non-LDH domain names.
>
> You're both right. John must be referring to this sentence in
> section 1:
>
>     This document does not require any applications to conform
>     to IDNA, but applications can elect to use IDNA in order
>     to support IDN while maintaining interoperability with
>     existing infrastructure.

yes

> So applications can indeed decide not to use IDNA, in which
> case they can't use non-ASCII characters in domain names. If
> you want to use non-ASCII characters in domain names, IDNA is
> your only option.

IMO, that needs to be said, and said right there and very, very,
explicitly.

>> And I certainly don't see anything that would indicate that
>> the second sentence is true.
>
> Agreed. We can't stop applications from using custom mappings
> from non-ASCII text onto domain names, but the rest of the
> world would never see the non-ASCII characters. The rest of
> the world would simply see RFC 1035 domain names, which
> contain only ASCII characters (and possibly octets 80..FF,
> which are not defined to represent any characters at all).

But we can say, clearly, that any such applications or activities
are non-conforming to IETF standards. And I believe we should do
so.

> John wrote:
>
>> It is also not clear whether "an application", as used in the
>> spec, refers to a standard protocol and expectations about how
>> all conforming implementations will behave or to particular
>> implementations and implementation choices.
>
> Both, I think. An individual application can unilaterally
> decide to support IDNA, and a spec can opt to require IDNA
> support.

Ok.
I think it should be more clear. And I need to think through the
cases on this.

>> But, if an application does a DNS lookup for an LDH name
>> (with no prefix) with, say, an MX Qtype, and some of the
>> result data are returned with IDNA-appearing prefixes, then
>> the application needs additional clues as to how to present
>> error messages to the user, how to continue processing in
>> other areas, etc.
>
> I don't see why. If the application conforms to IDNA, it will
> work. If it ignores IDNA and treats the name like any other
> traditional ASCII name, it will work.

True as long as it doesn't support a prefix-based interpretation,
or a different prefix, with a different interpretation than IDNA.
I took the text to imply that you were willing to tolerate such
things. If you don't intend to, the text has to both make clear
that they are non-conforming and, preferably, to damn them roundly
and explain why.

>> (ii) The specification seems inconsistent about what it is about
>> and who needs to understand it. It implies in several places
>> that it is not about the DNS and has no impact on the DNS.
>
> Paul replied:
>
>> I can't find any place that makes it not about the DNS.
>
> Perhaps John is referring to this:
>
>     IDNA does not require any changes to DNS servers,
>     resolvers, or protocol elements

For starters, yes. I'll try to go back and find the rest of the
text that set me off. Or have a look at some of Dave's "not
related to the DNS, just 'micro layer'" comments of the last few
days.

> John wrote:
>
>> But it then makes and imposes normative DNS operational
>> statements. E.g., "Non-ACE labels that begin with the ACE
>> prefix will confuse users and SHOULD NOT be allowed in DNS
>> zones" is certainly a requirement about DNS and DNS zone
>> population.
>
> You are quite right. When we wrote "IDNA does not require any
> changes to DNS servers", the word "require" was intended in
> the sense of "depend on".
> I have already suggested changing
> it to "depend on" (here and in similar places in the spec).

That would be _much_ better, IMO.

> Although IDNA does impose this new recommendation on DNS
> servers, it would still work if it didn't.
>
>> (iii) Despite the "applications can choose to do this, or not
>> do it" language, section 6.4 effectively implies a
>> requirement that any application that uses the DNS be
>> upgraded.
>
> Paul replied:
>
>> Every application of DNS should be upgraded for every
>> extension or improvement of the DNS protocols. Nothing in
>> 6.4 requires this, as far as I can see.
>
> I think John must be referring to this:
>
>     All applications that might show the user a domain name
>     obtained from a domain name slot, such as from
>     gethostbyaddr or part of a mail header, SHOULD be updated
>     as soon as possible in order to prevent users from seeing
>     the ACE.

yes

> "SHOULD" is fairly strong, perhaps strong enough that this
> paragraph might be considered inconsistent with this one:
>
>     This document does not require any applications to conform
>     to IDNA, but applications can elect to use IDNA in order
>     to support IDN while maintaining interoperability with
>     existing infrastructure.

yes

> Maybe it would be less contentious if 6.4 merely stated the
> benefits of upgrading applications, leaving the reader
> to conclude that doing so is a good idea.

Personally, I would actually prefer that this whole IDN package
come with a plan about how we are going to deprecate the
non-conforming. The trick we tried to use with ESMTP was to invent
the notion of a "contemporary implementation" and then talk about
what "contemporary" (or "modern", or...) did and how they behaved.
"Legacy" (or "older", or ...) implementations could still be said
to conform to the old standards, but the strong implication was
that they were not up-to-date if they weren't prepared to deal
with the new stuff.
I think something like that might help here although, regardless
of my other points and issues, this whole situation feels more and
more as if we ought to try to get an AS out that addresses
relationships among protocols, expectations, and quality of
implementation issues. I wouldn't mind just stuffing all of that
into IDNA, but it probably isn't the right thing to do (or fair to
you guys).

>> (iv) Section 7 imposes several normative requirements on name
>> servers and zone populations. Again, these requirements are
>> buried in a document that elsewhere appears to claim that it
>> doesn't impact the DNS.
>
> First requirement:
>
>     Internationalized domain name data in zone files (as
>     specified by section 5 of RFC 1035) MUST be processed with
>     ToASCII before it is entered in the zone files.
>
> This follows directly from section 3 requirement 2. It is not
> a special requirement on DNS, it is a general requirement on
> any application that elects to use IDNs. Section 7 is simply
> telling how it applies to DNS. It's still true that IDNA is
> optional; if you don't want to support it, then don't, but
> then you have no way to enter non-ASCII names into zones.

This is, to put it mildly, not clear. Or not clear enough. At
least to someone who hasn't been immersed in the document.

> Second requirement:
>
>     a primary master name server MUST NOT contain an
>     ACE-encoded label that decodes to an ASCII label.
>
> This is a vacuous requirement. There is no such thing as an
> ACE label that decodes to an ASCII label, because of the
> design of the ToASCII operation. Therefore, this is not
> really requiring anything. I think I once asked for its
> removal, but since it's harmless, I didn't press it.

Recommendation: Leave it there, since it is harmless and may be
helpful to someone. But rephrase it, not as a "MUST" (or any other
form of requirement), but as an observation.
E.g., "note that, since the design of the ToASCII operation
prevents any ACE label from decoding to an ASCII label (i.e., one
without any non-ASCII characters), a primary master name server
will never...". Or something like that.

> Third requirement:
>
>> The third paragraph of section 7 seems to contradict or
>> repeal the statements of RFC 2181 that non-ASCII strings are
>> permitted in labels
>
> Good catch. We'll have to do something about that.

Yes. Or Randy and kre will kill you. :-)

>> (v) In particular, several of the statements in the draft go
>> beyond almost anything we have about the DNS in trying to
>> narrowly constrain the behavior of RR labels and data that
>> are not yet defined, even in Classes that are not yet defined.
>
> You found one such statement, please feel free (and
> encouraged) to enumerate the others. :)

I'll go back and look. I think other comments above identify
places where I simply reached different conclusions about what you
meant than you intended. I think those things are worth
clarifying -- others won't read the thing even as carefully as I
have, much less be as familiar with it as you are. But I'd still
prefer to see either enumerations of where IDNA can be used, or
some much more clear language about its applicability. That gets
back to the "applications" issue and the very broad "all labels"
and its implications. See below.

>> IDNA, by specifying nameprep profiling as part of the
>> procedure, violates this principle by effectively requiring a
>> one-off protocol for situations in which nameprep is
>> inappropriate but other IDNA steps would be reasonable.
>
> Currently, all domain labels are compared the same way.

This is exactly where I get hung up. Leaving the more subtle issue
that Eric is trying to explore and some of its close relatives
aside for the moment, let's assume that it is rational to apply
IDNA to all domain labels and associated parameters in all
_current_ RRs. That may be completely reasonable.
But suppose that, at some point, and just as an example, some
proposal goes forward to use a different Class for some purpose,
or to use the DNS for a different purpose entirely. Such
transitions involve exactly the sorts of infrastructure pain and
suffering that IDNA was designed to avoid, but it is legitimate
for the community to conclude that they are necessary. If we do
decide to accept that pain and suffering, then it would be good,
IMO, to be able to use it as an opportunity to clean out, as much
as possible, things that were done only to preserve compatibility.

I'd like to give those future designers the opportunity to make
choices about whether to use IDNA (or ACE generally) to handle
internationalization, the ability to think through whether more of
the normalization or matching processes should be moved to the
server, rather than being handled exclusively in the client, etc.
I don't want (or need) to predict what they will conclude, but I
think it is unnecessary and dangerous for us, at this stage, to
write what seems to be an "all internationalization uses IDNA, now
and forever, including in RRs and Classes and uses of the DNS we
have no way to anticipate" rule.

So I think we all need to understand the points Eric is trying to
make. But, even if they are totally irrelevant or we conclude that
they aren't worth the trouble, I believe that we should confine
IDNA application and interpretation to what we can reasonably
claim we know about today and that we should _explicitly_ leave
future use and applicability for decisions to be made in the
future.

> Allowing multiple profiles amounts to allowing multiple
> comparison functions. Eric Hall disputes this, but if you
> stumble across two labels, one containing uppercase Latin
> letter O with umlaut, the other containing lowercase Latin
> letter o followed by combining umlaut, and you wonder whether
> they match, what is the answer? Yes, no, or maybe? For ASCII
> domain names, there is no maybe.
> We deliberately designed
> IDNA so that the same is true for IDNs.

Understood, but not the point.

What follows is going to be long, and I apologize, but we have
proof that this isn't easy to explain. I strongly suspect that the
problems/issues that are driving Eric are different from mine, so
please don't take this as representing (or preempting) his
concerns and cases.

At the risk of exposing another philosophical problem, let me
repeat (or paraphrase) what I said in Minneapolis: for me, the
essence of a good engineering job is to understand _all_ of the
constraints that define the solution space for a problem and then
either find a solution that fits within that solution space or be
very clear to identify solutions that may not quite fit and the
issues that make up the tradeoffs. Conversely, I consider that
saying "we will define the problem very narrowly so that we can
solve it and then leave the mess that creates for the rest of the
world to work out" is lousy engineering and worse standardization
practice. Now, that is just my opinion, but it is the foundation
for an explanation, so bear with me.

My guess is that the whole notion of "the registries will have to
figure this out" is going to fail miserably. I don't think that is
a valid argument for not publishing IDNA (or any of the rest of
these IDN WG things) -- I am speculating, and this is the reason
we have proposed standards and multiple maturity levels. But, to
the extent to which LDH has advantages, those advantages have
"worked" because they have been enforced by code in applications,
not just by registries. If we say "open season, anything can go
in, but some names aren't going to resolve because the registries
will prohibit them" we end up in a much weaker state. We also end
up with increased load on servers and, unless the state of
negative caching improves a lot, probably disproportionately more
load on the root.
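
The application-side enforcement of LDH described above can be sketched as follows. This is a minimal illustration, not text from any of the drafts: the function names `is_ldh_label` and `is_ldh_name` are made up for this example, and the rules encoded (1 to 63 characters per label, no leading or trailing hyphen, at most 253 characters overall) are the conventional hostname limits, assumed here rather than quoted from the thread.

```python
import re

# LDH = Letters, Digits, Hyphen (ASCII only). The pattern allows 1..63
# characters and forbids a hyphen in the first or last position.
# Assumption: these are the conventional hostname rules, not IDNA text.
_LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: str) -> bool:
    """Return True if a single label consists only of valid LDH characters."""
    return _LDH_LABEL.fullmatch(label) is not None


def is_ldh_name(name: str) -> bool:
    """Return True if every label of a dotted name passes the LDH check."""
    if name.endswith("."):  # tolerate one trailing root dot
        name = name[:-1]
    if not name or len(name) > 253:
        return False
    return all(is_ldh_label(label) for label in name.split("."))


if __name__ == "__main__":
    for candidate in ("example.com", "exa_mple.com", "b\u00fccher.example"):
        verdict = "ok to query" if is_ldh_name(candidate) else "rejected locally"
        print(f"{candidate!r}: {verdict}")
```

A resolver front end could call `is_ldh_name` before issuing a query and fail locally on a mismatch, so that malformed names never reach a server.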
So, I contend there are big advantages to being able to
sanity-check names before going off to the DNS and that the
advantages get larger in the very complex world in which
nearly-arbitrary Unicode strings are permitted as names.

Now, suppose someone comes along and invents a new RR type. And,
in addition to learning from our trying to over-rely on
registries, the nature of that RR type is that, while some
internationalization is appropriate for its labels, performance
--for both successful and unsuccessful queries-- is a really
critical issue, even relative to DNS norms. And let's suppose
that, after appropriate analysis, the conclusion is that the
characters that should be permitted in labels for that RR type
should be slightly more than the 63 (or 64) of LDH, but many fewer
than "most of Unicode"... say a few hundreds or, at most, a few
thousand, of select characters. And, whatever characters are
chosen, they are unambiguous, for both technical and
user-perception definitions of ambiguity.

Now, to accomplish that goal (please let's not get started on
whether it is reasonable or not -- this is both future and
hypothetical and I don't know what would cause such an RR to be
created), the best path, given our current framework, would be to
modify stringprep to add a table of the characters permitted by
that RR. _Then_ the right thing to do is to create a profile that
uses that table instead of some of the maps and prohibitions of
nameprep.

I am suggesting that (i) it should still be possible to use IDNA,
but with a different profile and (ii) we should not, as a side
effect of the way we specify IDNA today, prevent that RR from
being defined, regardless of whether it can use IDNA+different
profile or whether it needs to use an entirely different protocol
to set up, compare, and map names.

    john
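
The hypothetical restricted profile described above can be sketched roughly as below. Everything here is an assumption made for illustration: the `ALLOWED` table, the names `restricted_prep` and `labels_match`, and the use of Python's case folding plus NFKC as a loose stand-in for nameprep's mapping and normalization steps (real nameprep uses the stringprep tables, which this sketch does not reproduce).

```python
import unicodedata

# Hypothetical table of permitted characters for the imagined RR type:
# lowercase LDH plus a few precomposed Latin letters (a/o/u with umlaut,
# e with acute and grave, n with tilde). Purely illustrative.
ALLOWED = frozenset(
    "abcdefghijklmnopqrstuvwxyz0123456789-"
    "\u00e4\u00f6\u00fc\u00e9\u00e8\u00f1"
)


def restricted_prep(label: str) -> str:
    """Map and normalize a label, then reject anything outside ALLOWED."""
    # Case folding followed by NFKC; an approximation of nameprep's
    # "map" and "normalize" steps, not the real stringprep tables.
    prepared = unicodedata.normalize("NFKC", label.casefold())
    if not prepared:
        raise ValueError("empty label")
    rejected = sorted(set(prepared) - ALLOWED)
    if rejected:
        raise ValueError(f"characters not permitted by this profile: {rejected!r}")
    return prepared


def labels_match(a: str, b: str) -> bool:
    """Compare two labels under the restricted profile: yes or no, never maybe."""
    return restricted_prep(a) == restricted_prep(b)


if __name__ == "__main__":
    # Uppercase O with umlaut vs. lowercase o followed by a combining umlaut.
    print(labels_match("\u00d6l", "o\u0308l"))
    try:
        restricted_prep("\u65e5\u672c")  # characters outside the table
    except ValueError as exc:
        print("rejected:", exc)
```

Under this sketch, the two spellings of o-umlaut compare equal, while a label containing characters outside the table is rejected before any DNS query is made. Swapping in a different table changes which names are admissible without changing the comparison procedure.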
