"Dan Oscarsson" <[EMAIL PROTECTED]> wrote:

> >The basic idea here is to declare formal data-types for labels, and to
> >incorporate the data-types into syntaxes for applications and protocols to
> >use when they need to interact with domain names.
>
> While this is good, to make DNS really work the fundamental
> rules should be the same for all labels. Just like it has been
> so far.

No, DNS has two rules: octet strings with ASCII, and a hostname subset.

> While Eric allows binary in DNS labels making things very complex,
> I think we should go for how DNS is used and there are at least
> one RFC defining text and binary labels (binary label is defined by
> EDNS).

Binary EDNS is an application-specific representation of a sequence. It
does not define methods for storing or representing binary values as
domain names in application or protocol data-streams. The current
discussion is about the unencoded (raw) domain names, which may be
encoded in DNS or applications in as-yet-undecided forms.

Binary EDNS labels are unrelated to the current discussion, and are not
appropriate as a single method of dealing with these domain names (the
label data-types still have to be defined for the applications to use
them consistently, regardless of which encoding is used to represent
them by any specific application).

> The standard STD13 DNS label, and any new long label are TEXT labels.
> They may only contain printable characters.

No. TXT and other RRs may have any octet as an owner domain name. DNS is
not limited to identifying hosts.

> - They must be normalised.

No. TXT should have the same basic capabilities, which is to specify an
exact sequence of character codes.

> As Eric said:
> - Minimum length of one UCS character code.
> - Maximum length of 63 UCS character codes.
> - Maximum cumulative length of 255 UCS character codes in a domain name.
>
> Then Eric goes into different types of labels: host name, ascii,
> mailbox and srv.
> While an application can have special rules, DNS cannot.

Note that the ASCII label data-type is specifically provided to support
SRV. I had thought about naming it SRV and restricting it to LDH with a
leading underscore, but thought that a generic printable-ASCII type
would be more useful. This can be changed.

> For all labels in DNS (including host name and mailbox):
> - They must be case-insensitively matched.
> - They must retain original form (not converted to lower case)
>   in DNS.

These are different data-types with different considerations. Mailbox
names must be case-preserved in order to satisfy protocol dependencies,
and are not used in lookups, so normalization is not required. Host
names have no such dependencies.

STD13 defines domain names as case-neutral, and enforces this through
case-neutral comparison operations on the servers. However, making host
names case-neutral will require a new ACE, or will require that every
possible case combination be delegated and managed simultaneously. As
such, case-neutral comparisons can still be performed, but should be
done at the resolver instead of at the server.

> Having the above set in place, applications can apply additional
> rules and they can change over time without the basic DNS
> workings having to be changed.
> The above rules gives DNS a simple clear foundation to stand on.
>
> If you need a binary label, define one.
>
> --
> For applications we can define the additional rules Eric has
> specified. But some of them I can see no reason to have:
>
> Host names:
> - Must be allowed to have mixed case. DNS must be allowed
>   to return host names containing upper case letters.
>   Otherwise software will break.

Making host names case-neutral will require a new ACE, or will require
that every possible case combination be delegated and managed
simultaneously if legacy systems are to provide lookup functions against
the encoded names.

> - I can see no reason not to allow 1 character in minimum length.
>   Labels today have one character lengths.

The current delegation rules prohibit it. STD13 hostnames are currently
a subset of IDN and that should be preserved where possible.
Essentially, this change would mean ~"if all of the characters in the
delegation are LDH, then the minimum length is 2 characters, otherwise
it is one character."

> Mailbox labels:
> - In DNS they are not case-sensitive. Some mail systems are said
>   to have them case-sensitive. Are there still some such ancient
>   systems left?
>   DNS must compare them case-insensitive, but return them
>   retaining original case.

There is no comparison on mailbox names. Mailbox names are not specified
in any queries. For RR data, they must be provided in case-sensitive
form, but by the same token they must also be non-normalized until the
successor to 2822 says which normalization to use.

> What characters should be allowed in a label?
> Above I have defined it to be printable characters.
> Looking at how names are used, I would like to restrict
> it further. A name is often used as part of a text (for example
> in a manual or a web page). You then do not want the name
> to affect the formatting of the text. So you cannot allow
> anything in a name that affects direction, width, size, boldness, etc.
> So things like double width characters should not be allowed.
> This should probably be included in the definition of
> what is normalised text. Things like upper/lower case do not
> change the formatting of the text and can be used to enhance
> meaning or readability, and should be retained.

I am neutral on these issues. If there is a desire to exclude
double-width characters, then I will add it.
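
To make the host name length rule discussed above concrete (minimum of
two characters when every character in the label is LDH, one character
otherwise), here is a minimal Python sketch. It is only an illustration
under stated assumptions: the function and constant names are
hypothetical, the 63-character cap is taken from the limits quoted in
this thread, and counting Python code points is assumed to stand in for
"UCS character codes". Nothing here is a settled rule.

```python
import string

# Letters, digits, hyphen: the "LDH" repertoire of STD13 host names.
LDH = set(string.ascii_letters + string.digits + "-")

# Per-label maximum quoted in the thread (assumed here, not settled).
MAX_LABEL_LENGTH = 63


def min_label_length(label: str) -> int:
    """Hypothetical helper: 2 if the label is entirely LDH, else 1."""
    if label and all(ch in LDH for ch in label):
        return 2
    return 1


def host_label_length_ok(label: str) -> bool:
    """Check only the length bounds; no other host name rule is applied."""
    # len() counts code points, assumed to match "UCS character codes".
    return min_label_length(label) <= len(label) <= MAX_LABEL_LENGTH
```

Under this sketch a one-character ASCII label such as "a" is rejected,
while a one-character non-LDH label is accepted, which keeps existing
STD13 host names a subset of the new names.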
