RE: Single-letter names (was: Re: Update of RFC 2606 based on the recent ICANN changes?)

2008-07-07 Thread Edmon Chung
Regarding single Unicode code-point labels at the TLD level, there was quite
some discussion on this topic at the GNSO Reserved Names working group and
then at the new gTLD discussion.  The final recommendation from the GNSO
was:

"Single and two-character U-labels on the top level and second level of a
domain name should not be restricted in general. At the top level, requested
strings should be analyzed on a case-by-case basis in the new gTLD process
depending on the script and language used in order to determine whether the
string should be granted for allocation in the DNS. Single and two character
labels at the second level and the third level if applicable should be
available for registration, provided they are consistent with the IDN
Guidelines."

As for ASCII, the recommendation was:
"We recommend reservation of single letters at the top level based on
technical questions raised. If sufficient research at a later date
demonstrates that the technical issues and concerns are addressed, the topic
of releasing reservation status can be reconsidered."
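
As a rough illustration only, the two quoted recommendations could be read as the following top-level triage. The function name and the three outcome strings are hypothetical, not GNSO terminology, and the sketch deliberately returns "not covered" for anything the quoted text does not address.

```python
def classify_tld_label(label: str) -> str:
    """Triage a candidate top-level label against the two quoted recommendations.

    Hypothetical helper: the outcome strings are illustrative labels only.
    """
    is_ascii = all(ord(ch) < 128 for ch in label)
    if is_ascii:
        # ASCII recommendation: single letters are reserved at the top level.
        if len(label) == 1 and label.isalpha():
            return "reserved"
        return "not covered"
    # U-label recommendation: one or two code points are not restricted in
    # general, but are analyzed case by case at the top level.
    if len(label) <= 2:
        return "case-by-case"
    return "not covered"


for candidate in ["a", "\u732b", "com"]:
    print(candidate, "->", classify_tld_label(candidate))
```

Python's len() counts Unicode code points here, which matches the "single Unicode code-point" wording at the start of this note.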

Edmon

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:idna-update-
> [EMAIL PROTECTED] On Behalf Of Vint Cerf
> Sent: Saturday, July 05, 2008 3:33 AM
> To: John C Klensin
> Cc: James Seng; [EMAIL PROTECTED]; ietf@ietf.org; Lyman Chapin
> Subject: Re: Single-letter names (was: Re: Update of RFC 2606 based on the
> recent ICANN changes?)
> 
> john,
> 
> my reaction was specific to IDN single character TLDs. In some
> languages these are complete words.
> 
> vint
> 
> 
> On Jul 4, 2008, at 1:50 PM, John C Klensin wrote:
> 
> > Vint,
> >
> > In the ASCII space, there have been three explanations offered
> > historically for the one-character prohibition on top and
> > second-level domains.   I've written variations on this note
> > several times, so will just try to summarize here.  Of the
> > three, the first of these is at best of only historical interest
> > and may be apocryphal and the second is almost certainly no
> > longer relevant.  The third remains significant.
> >
> > (1) Jon has been quoted as suggesting that we could have
> > eliminated many of the problems we now face with TLDs and
> > simultaneously made the "no real semantics in TLD names" rule
> > much more clear had we initially allocated "b".."y" as TLDs.
> > Then, when someone asked for an assignment, it would have been
> > allocated at random to one of those domains.  While this has a
> > certain amount of appeal, at least in retrospect, there is
> > probably no way to get from where we are today to that model...
> > unless actions taken in the near future so ruin the current DNS
> > tree as a locus for stable and predictable references that we
> > need to start over with a new tree.  I don't think that a "have
> > to start over" scenario is at all likely, but I no longer believe
> > it to be impossible.
> >
> > (2) There was an idea floating around for a while that, if some
> > of the popular TLDs "filled up", one could create single-letter
> > subdomains and push subsequent registrations down the tree a
> > bit.  For example, if .COM were declared "full", then "a.com",
> > "b.com", etc., would be allocated and additional reservations
> > pushed into subdomains of those intermediate domains rather than
> > being registered at the second level.  Until and unless the
> > conventional wisdom that adding more names to .COM merely
> > requires more hardware and/or bandwidth changes, there won't be
> > a "filled up" point at which this sort of strategy could be
> > triggered.  Worse, trying to use single-letter subdomains as an
> > expansion mechanism would raise political issues about putting
> > latecomers at a disadvantage that would be, IMO, sufficient to
> > completely kill the idea.  In the current climate, I think the
> > community would decide that it preferred a dysfunctional DNS if
> > that were ever the choice (see the "start over" remark above).
> >
> > (3) At least in the discussions that led up to RFC 1591, and
> > probably much earlier, there were concerns about reducing the
> > likelihood of false hits if the end user made single-character
> > typing errors.  With only 26 (or maybe 36) possible characters,
> > it could just about be guaranteed that all of them would be
> > registered and that _any_ typing error would yield a false
> > match.  That, in itself, has been considered sufficient to
> > prohibit single-letter labels and, by extension, to be fairly
> > careful about two-letter one

Re: Single-letter names (was: Re: Update of RFC 2606 based on the recent ICANN changes?)

2008-07-07 Thread Vint Cerf

john,

my reaction was specific to IDN single character TLDs. In some  
languages these are complete words.


vint



Re: Single-letter names (was: Re: Update of RFC 2606 based on the recent ICANN changes?)

2008-07-07 Thread William Tan
John,

To add to your point, one should also consider the question of
embedded semantics in a single-character label.

Alphabetic scripts such as Latin mostly represent sounds used to make
up words. While one can certainly find some legitimate
single-character words (such as the article "a" or the personal
pronoun "i") and dream up others, it would not be very convincing in
the face of your explanation #3.

On the other hand, characters in ideographic scripts such as Han are
not mere sounds or glyphs; they represent one or more concepts.
Therefore, a single-ideographic-character TLD label is certainly more
justifiable. I would even go as far as to suggest that it is essential
in many cases. For example, "猫" (U+732B) in both Simplified Chinese
and Japanese means "cat" as in English, not the abbreviation for
Catalan nor the UNIX command. The reverse translation of "cat" yields
the exact character in Simplified Chinese, though in Japanese
sometimes the Hiragana sequence "ねこ" is also used. One would be
hard-pressed to come up with a different character to represent the
same concept in Han, aside from the traditional Chinese counterpart
"貓" (U+8C93).

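For anyone who wants to check the two code points, a minimal Python sketch follows. It assumes the standard library's raw "punycode" codec plus a hand-added "xn--" prefix as a stand-in for a real A-label conversion; actual IDNA processing adds validation steps that this skips, and the describe() helper is purely illustrative.

```python
simplified = "\u732b"   # the Simplified Chinese / Japanese character for "cat"
traditional = "\u8c93"  # the Traditional Chinese counterpart


def describe(label: str) -> dict:
    """Return the code points of a label and a raw Punycode form of it.

    Illustrative helper: no IDNA mapping or validation is performed.
    """
    encoded = label.encode("punycode").decode("ascii")
    return {
        "code_points": [f"U+{ord(ch):04X}" for ch in label],
        "a_label": "xn--" + encoded,
    }


for label in (simplified, traditional):
    print(label, describe(label))
```

Each label is a single code point, so either one would be a single-character U-label even though its encoded form on the wire is several ASCII characters long.
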
I don't know what the present thinking is on the idea of non-semantic
TLDs, but IMHO the social expectations of DNS usage are cast in stone.
Jon's idea would simply shift the semantics to the second level,
thereby creating 24 roots instead of a single "."

=wil


RE: Single-letter names (was: Re: Update of RFC 2606 based on the recent ICANN changes?)

2008-07-04 Thread JFC Morfin
I feel that Edmon's report of the ICANN/GNSO point of view and the 
positions of James Seng are shared by most of the groups we relate 
with (Internet @large, open roots, ISO lobbies, Multilinc, MINC, 
Eurolinc, ISOC France, ccTLDs, etc.). If this WG does not think they
are technically adequate, there is a real urgency to document why, to
have it confirmed by the IAB, and to disseminate it. This is due to
the constraints a change would introduce outside of the Internet
community and the general awareness of this debate after the Paris
meeting. This WG needs to speak up now, or the status quo will be
considered definitely settled.


I expect single-sign (logo) gcTLDs [geocultural] to be documented
this year for multilingual information machines (airports,
transport, health, kids, disabled). BTW, this is also why I would
recommend referring to the semiotic rather than the semantic aspects.

jfc


___
Ietf mailing list
Ietf@ietf.org
https://www.ietf.org/mailman/listinfo/ietf


Re: Single-letter names (was: Re: Update of RFC 2606 based on the recent ICANN changes?)

2008-07-04 Thread John C Klensin


--On Friday, 04 July, 2008 15:01 -0400 William Tan
<[EMAIL PROTECTED]> wrote:

> John,
> 
> To add to your point, one should also consider the question of
> embedded semantics in a single-character label.
> 
> Alphabetic scripts such as Latin mostly represent sounds used
> to make up words. While one can certainly find some legitimate
> single-character words (such as the article "a" or the personal
> pronoun "i") and dream up others, it would not be very
> convincing in the face of your explanation #3.

Agreed.
 
> On the other hand, characters in ideographic scripts such as
> Han are not mere sounds or glyphs; they represent one or more
> concepts. Therefore, a single-ideographic-character TLD label
> is certainly more justifiable. I would even go as far as to
> suggest that it is essential in many cases. For example, "猫"
> (U+732B) in both Simplified Chinese and Japanese means "cat"
> as in English, not the abbreviation for Catalan nor the UNIX
> command. The reverse translation of "cat" yields the exact
> character in Simplified Chinese, though in Japanese sometimes
> the Hiragana sequence "ねこ" is also used. One would be
> hard-pressed to come up with a different character to
> represent the same concept in Han, aside from the traditional
> Chinese counterpart "貓" (U+8C93).

Yes.   As I tried to indicate, I was trying to be brief and
obviously left some things out as a result.  While I agree with
what you say above, it also opens another question.   I'm not
quite ready to agree with the often-expressed principle that
people have some "right" to register particular names.  For
example, IBM clearly owns a well-known mark "ibm".  That gives
them some rights --in trademark law, rather than the DNS-- to
prevent anyone else from using the string, at least in ways that
would create confusion.  But it doesn't give them any inherent
"rights" to register the name in the DNS.  In this specific
case, while I don't see any reason to ban
single-"ideographic"-letter TLDs, I also don't believe that the
fact that U+732B, by itself, means "cat" creates any intrinsic
right to register it in the DNS.   If there were a compelling
reason to ban single-letter ideographic TLDs, I would not
consider your "cat" example to be particularly compelling
because I don't believe there is a "right" to a TLD for cats or
the equivalent.

That distinction is important because I think it quite likely
that, as we look at other alphabetic scripts with relatively
small numbers of characters, we will find some where more, and
more significant, words are spelled with only one character
than is the case with Western European languages.
And I believe the rule for those scripts, for the reasons given
in my earlier note, should be "no single-letter domains", not
"no single-letter domains unless one can find a dictionary
entry".


> I don't know what the present thinking is on the idea of
> non-semantic TLDs, but IMHO the social expectations of DNS
> usage are cast in stone. Jon's idea would simply shift the
> semantics to the second level, thereby creating 24 roots
> instead of a single "."

As I indicated, I think that particular idea is no longer
relevant (if it ever was).  I'm happy to engage in speculation
about whether it could ever have worked, but only in the
presence of strong drink.

 john




Single-letter names (was: Re: Update of RFC 2606 based on the recent ICANN changes?)

2008-07-04 Thread John C Klensin
Vint,

In the ASCII space, there have been three explanations offered
historically for the one-character prohibition on top and
second-level domains.   I've written variations on this note
several times, so will just try to summarize here.  Of the
three, the first of these is at best of only historical interest
and may be apocryphal and the second is almost certainly no
longer relevant.  The third remains significant.

(1) Jon has been quoted as suggesting that we could have
eliminated many of the problems we now face with TLDs and
simultaneously made the "no real semantics in TLD names" rule
much more clear had we initially allocated "b".."y" as TLDs.
Then, when someone asked for an assignment, it would have been
allocated at random to one of those domains.  While this has a
certain amount of appeal, at least in retrospect, there is
probably no way to get from where we are today to that model...
unless actions taken in the near future so ruin the current DNS
tree as a locus for stable and predictable references that we
need to start over with a new tree.  I don't think that a "have
to start over" scenario is at all likely, but I no longer believe
it to be impossible.

(2) There was an idea floating around for a while that, if some
of the popular TLDs "filled up", one could create single-letter
subdomains and push subsequent registrations down the tree a
bit.  For example, if .COM were declared "full", then "a.com",
"b.com", etc., would be allocated and additional reservations
pushed into subdomains of those intermediate domains rather than
being registered at the second level.  Until and unless the
conventional wisdom that adding more names to .COM merely
requires more hardware and/or bandwidth changes, there won't be
a "filled up" point at which this sort of strategy could be
triggered.  Worse, trying to use single-letter subdomains as an
expansion mechanism would raise political issues about putting
latecomers at a disadvantage that would be, IMO, sufficient to
completely kill the idea.  In the current climate, I think the
community would decide that it preferred a dysfunctional DNS if
that were ever the choice (see the "start over" remark above).

(3) At least in the discussions that led up to RFC 1591, and
probably much earlier, there were concerns about reducing the
likelihood of false hits if the end user made single-character
typing errors.  With only 26 (or maybe 36) possible characters,
it could just about be guaranteed that all of them would be
registered and that _any_ typing error would yield a false
match.  That, in itself, has been considered sufficient to
prohibit single-letter labels and, by extension, to be fairly
careful about two-letter ones.   There have been arguments on
and off over the years as to whether this is a "technical"
reason or an attempt to set policy.  Even though the mismatches
would obviously not cause the network to explode or IP to stop
working, at least some of us consider the information
retrieval and information-theoretic reasons to insist on more
information in domain name labels in order to lower the risk of
false positive matches to be fully as "technical" as something
that would have obvious lower-level network consequences.
Others --frankly especially those who see commercial advantage
in getting single-letter names-- have argued that this position
is just a policy decision in disguise.

Note that, with slight modifications, the second and third
arguments apply equally well to TLD allocations and to SLD
allocations, especially in popular domains.  

The reasoning associated with the third case also applies to any
other script that contains a fairly small number of characters.
One could manage a long philosophical discussion as to whether
there are sufficient characters in the fully-decorated
Latin-derived collection to eliminate the problem, but an
analysis of keyboard and typing techniques/input methods for
that range of characters would, IMO, yield the same answer --
single-letter domains are just not a good idea and two-letter
ones near the top of the tree should be used only with great
caution.   

On the other hand, the same reasoning would break down when
confronted with a script that contains thousands of characters,
such as the "ideographic" ones.  There are enough characters
available in those scripts that one can presumably not worry
about single-character typing errors (and one can perhaps worry
even less if the usual input methods involve typing
phonetically, using a different script, and then selecting the
relevant characters from a menu -- in those cases, the phonetic
representations are typically more than a character or two long
and the menu selection provides an extra check about false
matches).
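
The contrast between small and large scripts in argument (3) can be sketched numerically. The following is a toy model only: it assumes a user who means one registered single-character label and mistypes a different character chosen uniformly from the same script, which real keyboards and input methods do not do. The function name is hypothetical, and the figures 20000 and 500 are made-up stand-ins for "thousands of characters" and a sparsely registered script.

```python
def typo_hit_probability(alphabet_size: int, registered: int) -> float:
    """Chance that a one-character typo on a single-character label
    lands on some other registered single-character label.

    Toy model: the wrong character is uniform over the other
    alphabet_size - 1 characters; registered - 1 of those are taken.
    """
    if alphabet_size < 2 or not 1 <= registered <= alphabet_size:
        raise ValueError("need alphabet_size >= 2 and 1 <= registered <= alphabet_size")
    return (registered - 1) / (alphabet_size - 1)


cases = [
    ("ASCII letters, all registered", 26, 26),
    ("letters and digits, all registered", 36, 36),
    ("hypothetical large script, sparse", 20000, 500),
]
for name, size, taken in cases:
    print(f"{name}: {typo_hit_probability(size, taken):.3f}")
```

Under these assumptions a fully registered 26- or 36-character space gives a certain false match, while the sparse large-script case stays at a few percent, which is the shape of the argument made above rather than a measurement of anything.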

 john



--On Thursday, 03 July, 2008 19:04 -0400 Vint Cerf
<[EMAIL PROTECTED]> wrote:

> seems odd to me too, James.
> 
> vint
> 
> 
> On Jul 3, 2008, at 6:14 PM, James Seng wrote:
> 
>>> At the moment, the condition is "no single Unicode code
>>> point."