On Wed, Mar 11, 2009 at 11:44:54PM +0800, James Seng wrote:
> > > <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
> > ...
> > <letter> ::= any one of the 52 alphabetic characters A through Z in
> upper case and a through z in lower case

Selective quoting can prove anything.  Immediately prior to that
section, RFC 1035 says

    The following syntax will result in fewer problems with many
    applications that use domain names (e.g., mail, TELNET).

> a) is highly debatable because it is not an explicit requirement since
> it is mention in a section called "DISCUSSION" in a passing that
> "since at least the highest-level component label will be alphabetic",
> in the context that TLD is alphabetic only as a matter of fact at that
> time, not as a matter of technical requirement

I just responded to that exact argument up-thread, but since that
apparently wasn't convincing, let's do it in more detail.

The beginning of section 2.1 of RFC 1123 relaxes a requirement of
RFC 952 that host names may never start with a digit.  RFC 1123 says
that host software MUST support the more liberal syntax.  Moreover, the
host SHOULD check a candidate string "syntactically for a dotted-decimal
number before looking it up in the Domain Name System."

As Mark Andrews has argued elsewhere on this list, the single label
"666" could be interpreted as an IP address.  Various hex
representations may also be interpreted as an IP address.  These may
therefore pass the check for being a dotted-decimal number.

The DISCUSSION portion of 2.1 is explaining why relaxing RFC 952's
restriction is safe.  The safety flows exclusively from the premise
that the highest-level component label of a domain name "will be
alphabetic"; this guarantees that a syntactic check for an IP address
will fail due to at least one label being made up only of letters.

It may be, therefore, that the "alphabetic restriction" is in fact
policy, and is not strictly a protocol issue.  The problem is that it
is policy on which other technical decisions rest.  Change the policy,
and the justification for those other technical decisions is
undermined.

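As an illustration only, the syntactic check described above can be
sketched in Python.  This is a rough sketch, not the code of any real
resolver: the function names are hypothetical, and the lenient forms it
accepts (one to four parts, each decimal, octal, or hex) are an
assumption modelled on classic inet_aton()-style parsing, which is more
permissive than the strict four-part dotted-decimal form RFC 1123
names.

```python
import string

HEX_DIGITS = set(string.hexdigits)
OCT_DIGITS = set("01234567")
DEC_DIGITS = set(string.digits)


def parse_part(part):
    """Return the value of one dot-separated part, or None if not numeric.

    Assumption: decimal, octal (leading 0) and hex (leading 0x) are all
    accepted, as in classic inet_aton()-style parsers.
    """
    if part[:2].lower() == "0x":
        digits, base, allowed = part[2:], 16, HEX_DIGITS
    elif len(part) > 1 and part[0] == "0":
        digits, base, allowed = part[1:], 8, OCT_DIGITS
    else:
        digits, base, allowed = part, 10, DEC_DIGITS
    if not digits or not set(digits) <= allowed:
        return None
    return int(digits, base)


def looks_like_ipv4_literal(name):
    """Hypothetical pre-lookup check: could `name` be read as an IPv4 address?"""
    parts = name.split(".")
    if not 1 <= len(parts) <= 4:
        return False
    values = [parse_part(p) for p in parts]
    if any(v is None for v in values):
        # A single non-numeric label is enough to rule out an address.
        return False
    *head, last = values
    if any(v > 0xFF for v in head):
        return False
    # The last part fills whatever bytes the leading parts left over.
    return last < 256 ** (4 - len(head))


for candidate in ("192.0.2.1", "666", "0x7f.1", "www.example.com", "1.2.3.com"):
    print(candidate, looks_like_ipv4_literal(candidate))
```

Under these assumptions any name whose top-level label is alphabetic
fails the check, because that one label can never parse as a number.
Remove that guarantee and a string such as "666" or "0x7f.1" is
accepted as an address before the DNS is ever consulted.
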
In this sense, the claim in the DISCUSSION portion of 2.1 is not just a
policy: it is also the foundation of other protocol issues, and is
therefore normative on the protocol even if it _is_ a policy matter.

Finally, it is well-known that there are many implementations of
software -- particularly with respect to the DNS -- where people with a
less-than-nuanced reading of various RFCs have based what they will
allow on that reading of the RFC.  The "7 bit DNS" implementations are
an excellent example of this: RFC 1035 was clear that the DNS itself
allowed other characters, but implementations checked for the
"preferred syntax" anyway because that was the safest bet.  We know
empirically that there were lots of checks (and in some cases still
are) for "valid" TLD labels that looked for things no longer than three
letters.  The 2001 introduction of a number of new TLDs was rockier
than necessary partly because of those checks, even though there was
never an RFC that suggested such was a good check.  1123 _does_ suggest
that it is reasonable to check for top-level labels being alphabetic,
and I'd bet a pretty good lunch that we can find implementations that
decide whether something is a "domain name" based on whether the top
label starts with a letter.

Therefore, even if we don't think that 1123 does in fact restrict the
top-level label to letters only, it is prudent to treat such a
restriction as a _de facto_ part of the protocol.  To the extent we
want to change that de facto part of the protocol, we want to do as
little damage as possible.  An argument in favour of John Klensin's
suggestion to make an explicit exception for IDNA2008 A-labels is that
it is the smallest change that can be made that still accommodates the
new feature we want.

Best regards,

Andrew

-- 
Andrew Sullivan
a...@shinkuro.com
Shinkuro, Inc.
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop