Re: Will Language Wars Balkanize the Web?

2000-12-08 Thread Fred Baker

At 03:49 AM 12/8/00 +0859, Masataka Ohta wrote:
>However, they can't justify calling them internationalization.

precisely.




end to end (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Dave Crocker

At 06:21 PM 12/6/00 +, Graham Klyne wrote:
>BTW, the basic tenet of end-to-end connectivity of data and services is, I 
>think, satisfied by the IP layer.  Part of my question was about the 
>extent to which this end-to-end-ness needs to be duplicated at higher layers.

Not sure whether this is a distraction -- hence the modified Subject -- but 
I do NOT consider an end-to-end mechanism at one level to be sufficient, 
when talking about end-to-end at another level.

Lower layers must support the e2e requirements of the layer under 
discussion, but those lower layers do not satisfy the requirements by 
themselves.

If the layer under discussion, in this case the DNS application, does not 
support e2e, then the fact that IP does does not buy much.

d/


=-=-=-=-=
Dave Crocker  <[EMAIL PROTECTED]>
Brandenburg Consulting  
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Matt Crawford

>  If the world had asked you or me to design an international
> language, I think either of us would have done better.

Don't be too sure.  Even today, there are no more speakers of
Esperanto than of Mayan.




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Vernon Schryver

> From: Henk Langeveld <[EMAIL PROTECTED]>
>
> You know, it isn't that long ago that I realised that for many Americans,
> "International" is synonymous with "Non-American".

That is as true as the observation that many who learn English as a
second language think that "international" is synonymous with using
the language of their few dozen million countrymen.  

It is a fact that the single international language of the late 20th and
early 21st centuries is far more closely related to a subset of American English
than any other local language.  It is also a fact that only during my
lifetime has that odd situation developed.  If the world had asked you or
me to design an international language, I think either of us would have
done better.  But the first fact is all that matters.

If it makes you feel better, note that just as Latin was not exactly what
Italians spoke, the current international language is not exactly what is
spoken by citizens of the largest nation that calls itself The United
States of America (there are >1) and whose mother tongue is English.
Thanks to satellite TV and other forms of what the P.C. call cultural
imperialism, the modern differences are small, but they exist.


> From: Dave Crocker <[EMAIL PROTECTED]>

> >Diacritical marks are no different from Cyrillic, Arabic, Greek, Hebrew,
> >Sanskrit, and other non-Latin character sets in not being part of
> >the international language.  The goal of communicating is to communicate,
> >not wave flags in support of national languages.
>
> In a sense, Harald's observation points out a case in which all those other 
> sets very much ARE part of the "international" language.

If those are part of your "international language," then what characters
are not part of it?  It is Politically Correct to pretend we all speak,
read, and write a single language, but also hopelessly silly.


> It does not matter whether readers understood the semantics of the strings; 
> they needed to be able to see them.
>  That is not national flag waving.
>  That is global utility.

"Global unity" is a matter of everyone being able to communicate with
everyone else.  It not only has nothing to do with each of us using
our favorite set of glyphs, but goes against it.  Each of us using our
favorite language *internationally* is a real Tower of Babel.

Being able to use strings is not only a matter of being able to type their
characters.  Those of us who have studied languages with alphabets other
than what we learned while young have discovered that just as the human ear
has difficulty hearing sounds outside our mother tongues, the human eye
has trouble seeing foreign glyphs.  If they're not yours, all of those
diacritical marks look the same or are invisible.

There are good reasons why the international lingua francas of previous
millennia have forced people to transliterate their native writings
instead of importing them wholesale.  MIME and 8-bit domain names are
mechanisms for importing wholesale instead of transliterating.  They're
good *locally*, but not *internationally*.

> ...
> Technical standards work often gets distracted by trying to deal with 
> issues that are outside the scope of reasonable technical standards 
> work.  It should not be the task of such work to dictate or constrain users 
> to only socially acceptable behavior.  That is a social task, not a 
> technical one.

Yes.  So why do otherwise rational IETF participants claim that
social and political notions such as "global unity" are somehow
related to MIME and IDN?

MIME and localized domain names are good and necessary, but only
locally or provincially, even when "locally" involves vast land
areas (e.g. Russian or Spanish) or billions of people.

> Choosing to send various types of data requires making decisions about the 
> context.  No technical standard can be designed to "automatically" 
> determine when it is, or is not, appropriate to send that data, whether it 
> is diacritical marks, kanji, or an excel spread sheet.  Even when the 
> sender has information about recipient capabilities, social factors affect 
> the choices.

Yes, so why do some MIME and *localized* domain name advocates claim
otherwise?  What is the pathology insisting that sending MIME to
international mailing lists makes sense?  Why do apparently rational people
claim that 8-bit binary domain names are "international"?  Because they've been
infected with Political Correctness or because they don't want to dilute
political support among the unthinking for whatever they're advocating?


> ...
> At least the recipient has the unintelligible data well isolated and 
> labeled.  MIME did its job.

Yes, but the justification of the sender for using MIME to send
unintelligible data is crazy, since communication is averted while
resources, including the human recipient's time, are wasted.


> ...
> The question is whether a coherent extension to DNS will be done in a 
> fashion which will keep the DNS integrated, or whether

Re: Will Language Wars Balkanize the Web?

2000-12-07 Thread Masataka Ohta

Keith;

> > > you missed it. Suppose you could not exchange in commerce with a person of
> > > a given nationality, not because you did not have a language in common with
> > > him or her, but because your system could not interpret his or her name.
> > > That would mean that you could not spend money in that person's direction,
> > > because you could not communicate with him or her.
> > 
> > And it means that person is at a disadvantage in your marketspace, and
> > that it's not your problem.
> 
> why in the world do people think they can justify or not justify actions
> based on whether something is an advantage/disadvantage in some 
> "marketspace"?

They can justify them locally within local marketspaces, of course.

However, they can't justify calling them internationalization.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-07 Thread Keith Moore

> > you missed it. Suppose you could not exchange in commerce with a person of
> > a given nationality, not because you did not have a language in common with
> > him or her, but because your system could not interpret his or her name.
> > That would mean that you could not spend money in that person's direction,
> > because you could not communicate with him or her.
> 
> And it means that person is at a disadvantage in your marketspace, and
> that it's not your problem.

why in the world do people think they can justify or not justify actions
based on whether something is an advantage/disadvantage in some 
"marketspace"?

Keith




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Theodore Y. Ts'o

   Date: Thu, 07 Dec 2000 07:23:11 -0500
   From: Dave Crocker <[EMAIL PROTECTED]>

   At least the recipient has the unintelligible data well isolated and 
   labeled.  MIME did its job.

Indeed.  If I get a mail message which is in HTML only, 99.97% of the
time it's SPAM-mail.  And I've lost count of how many times I've received
Chinese (or other Asian language) SPAM-mail.  In fact, I'm seriously
thinking about coding up a rule which automatically junks HTML mail
unread. 

I guess MIME is useful for something.  :-)
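
[Editorial illustration, not part of Ted's message.] The kind of rule described above could be sketched roughly as follows. This is only a minimal sketch using Python's standard `email` package; the function name `is_html_only` and the sample messages are hypothetical, and a real filter would need to handle many more cases (attachments, malformed headers, and so on).

```python
from email import message_from_string
from email.message import Message


def is_html_only(msg: Message) -> bool:
    """Return True if the message has a text/html part but no text/plain part."""
    # Collect the content types of all leaf (non-multipart) parts.
    types = {
        part.get_content_type()
        for part in msg.walk()
        if not part.is_multipart()
    }
    return "text/html" in types and "text/plain" not in types


# Hypothetical sample messages, for illustration only.
RAW_HTML = (
    "From: a@example.org\n"
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/html; charset=us-ascii\n"
    "\n"
    "<html><body>Buy now</body></html>\n"
)
RAW_PLAIN = "From: b@example.org\nSubject: hi\n\nplain text body\n"

if __name__ == "__main__":
    for raw in (RAW_HTML, RAW_PLAIN):
        msg = message_from_string(raw)
        print(msg["Subject"], "-> junk" if is_html_only(msg) else "-> keep")
```

The check relies on the MIME labeling itself: because the sender declared the body as text/html, the recipient can act on it without reading the content.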


- Ted




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread John Stracke

Keith Moore wrote:

> Furthermore, a
> great many people use multiple languages (not necessarily including
> English), so that a given person, host, or subnetwork will often
> need to exist in multiple (potentially competing) locales at once.

Sometimes even in the same sentence.  My mother grew up partly in Quebec;
when she's talking to her siblings, they'll often use French words when the
English ones don't come to mind immediately.

--
/==\
|John Stracke| http://www.ecal.com |My opinions are my own.|
|Chief Scientist |=|
|eCal Corp.  |How many roads must a man walk down before he|
|[EMAIL PROTECTED]|admits he is LOST?   |
\==/






Re: Will Language Wars Balkanize the Web?

2000-12-07 Thread Graham Klyne

At 08:15 AM 12/4/00 -0500, Dave Crocker wrote:
>On the other hand, this thread was triggered by Graham's question about 
>the negative impact of partitioning.  The postal example would seem to 
>show that the effect is not so bad.
>
>Except I would claim that it is not partitioning.  Note that an address 
>always has a global representation, in addition to a possibly different 
>local one.

You're right, it's not strictly partitioning...

>Perhaps that can be reconciled as easily as claiming that any 'local' domain 
>name must also have a global form? (But, somehow, the word "scaling" gets 
>in the way of believing that.)

... when I asked that question, I had in mind something like what Tim 
Berners-Lee presented at the WWW5 conference in 1996, in which connectivity 
between communities might be seen as having a fractal structure, with 
groupings and lines of communication between groups visible at a range of 
scales.  I think this is, in part, how people achieve flexible scalability 
in their communications.  (Similar patterns also arise in natural phenomena).

#g
--

BTW, the basic tenet of end-to-end connectivity of data and services is, I 
think, satisfied by the IP layer.  Part of my question was about the extent 
to which this end-to-end-ness needs to be duplicated at higher layers.



Graham Klyne
([EMAIL PROTECTED])




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Keith Moore

The notion that use of languages other than English can or should be 
'localized' strikes me as both shockingly arrogant and hopelessly naive.  

People can and will use their own languages on the Internet - in email, 
on the web, and in domain names, and without regard to their location
in either the physical world, the current topology of the network,
or the TLD of the host they are using at the moment.  Furthermore, a 
great many people use multiple languages (not necessarily including 
English), so that a given person, host, or subnetwork will often 
need to exist in multiple (potentially competing) locales at once.

And while a great many people - who speak only a single language, 
and whose travels are confined to a small geographic area where
others speak only that language - might indeed be happy with a 
localized solution, adoption of purely localized solutions would 
impair the vast number of people who do not fall into that category.

The question is not whether people will use non-ASCII characters in
domain names, but whether the various uses of non-ASCII characters
will coexist peacefully with each other and with existing applications,
and whether applications will continue to interoperate with one another.

So while it's quite important that IDNs be able to be represented
in ASCII for compatibility with existing applications and the large
number of protocols that use DNS names as protocol elements, and 
even though we all understand that pure-ASCII, Romanized versions
of non-English names will continue to enjoy wide use -- we still need
to produce an IDN standard as soon as possible.

Fortunately the IDN group is making very good progress, and I'm 
confident that consensus around a concrete proposal will soon emerge.

Keith




Re: Will Language Wars Balkanize the Web?

2000-12-07 Thread Fred Baker

At 06:06 PM 12/3/00 -0500, Betsy Brennan wrote:
>But the Internet is not the postal system nor the phone system. We already
>have the postal system and the phone system.  They may be slower, but does
>that mean they should be replaced or that the Internet must duplicate what
>these systems do? BLB

you missed it. Suppose you could not exchange in commerce with a person of 
a given nationality, not because you did not have a language in common with 
him or her, but because your system could not interpret his or her name. 
That would mean that you could not spend money in that person's direction, 
because you could not communicate with him or her. Although IP datagrams 
could get from you to him/her, there would not be a good way to determine 
what address to send them to. That would be pretty tough.




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Dave Crocker

At 01:58 AM 12/7/00 -0700, Vernon Schryver wrote:
> > From: Harald Alvestrand <[EMAIL PROTECTED]>
> > it may have escaped the notice of some that a fair bit of the discussion on
> > diacritics was carried out using live examples,
>
>Diacritical marks are no different from Cyrillic, Arabic, Greek, Hebrew,
>Sanskrit, and other non-Latin character sets in not being part of
>the international language.  The goal of communicating is to communicate,
>not wave flags in support of national languages.


In a sense, Harald's observation points out a case in which all those other 
sets very much ARE part of the "international" language.

The live examples were a) intended, b) appropriate, and c) successful.

It does not matter whether readers understood the semantics of the strings; 
they needed to be able to see them.

 That is not national flag waving.

 That is global utility.


> > MIME character sets is an example of a battle fought and won.
>
>When MIME is used to pass special forms among people whose common
>understandings include more or other than ASCII, MIME is a battle
>fought and won.
>When MIME is used to send unintelligible garbage, it is a battle fought
>and lost.

Technical standards work often gets distracted by trying to deal with 
issues that are outside the scope of reasonable technical standards 
work.  It should not be the task of such work to dictate or constrain users 
to only socially acceptable behavior.  That is a social task, not a 
technical one.

Choosing to send various types of data requires making decisions about the 
context.  No technical standard can be designed to "automatically" 
determine when it is, or is not, appropriate to send that data, whether it 
is diacritical marks, kanji, or an excel spread sheet.  Even when the 
sender has information about recipient capabilities, social factors affect 
the choices.

 So sending such data in MIME inappropriately
 is STILL an example of a battle fought and won.

At least the recipient has the unintelligible data well isolated and 
labeled.  MIME did its job.



At 08:19 AM 12/6/00 -0500, vint cerf wrote:
>Even if we introduce extended character sets, it seems vital
>that there be some form of domain name that can be rendered
>(and entered) as simple IA4 characters to assure continued
>interworking at the most basic levels. This suggests that
>there is need for some correspondence between an IA4 Domain
>Name and any extended characterset counterpart.


The same task is at issue for the DNS as it was for MIME.  We need a 
mechanism for labeling and encoding DNS strings and, I believe, we need it 
to be added to the existing DNS.

 Users of those strings will be all over the world,
 not just in a particular locale.

The need for this capability is massive and immediate.

There WILL be a solution deployed.  In fact there already is.

The question is whether a coherent extension to DNS will be done in a 
fashion which will keep the DNS integrated, or whether this requirement 
produces an independent DNS.

 That's not flag-waving.

 That's multiple DNS namespaces.

We need to be careful to distinguish two different requirements.  One is 
for a mechanism to encode domain names in non-ascii character sets.  The 
second is for an equivalence mapping from non-ascii domain names into ascii 
domain names.  The former is so that the technical and operational aspects 
of the DNS remain coherent.  The latter is so that everyone has a way to 
reach a particular domain, even if they cannot generate the non-ascii form 
of the name.

The extreme form of the latter task involves ascii encodings that are 
"comfortable" for human users; that requirement is not solved in human 
non-technical situations.  I believe that the example of alternate choices 
of "jin" and "gin" as representations for some chinese character(s) was 
used.  Hence this extreme form of the task is not going to be solved by 
lowly IETF protocol designers.

At best, use of ACE-like encodings permits an ascii representation, albeit 
one that is "uncomfortable".  It is as far as the IETF should go in trying 
to permit a "universally accessible" form for all domain names.

Interestingly, we do not need to have all domain names exchanged and stored 
in an ACE form, forever.  Just as MIME is able to support pure binary 
encodings, so can the DNS.  The ACE form can be mapped to when needed.
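
[Editorial illustration, not part of the original message.] The two requirements above (an encoding for non-ASCII names, and a mapping to an ASCII form that is reachable even if "uncomfortable") can be sketched as below. As an assumption, the sketch uses Python's built-in "idna" codec, which implements the later IDNA/Punycode work and is only a stand-in for the "ACE-like" encodings under discussion in this thread; the helper names `to_ace` and `from_ace` are hypothetical.

```python
# Sketch only: the "idna" codec stands in for an ACE-like encoding.
# It postdates this thread and is not the scheme the posters had in hand.


def to_ace(name: str) -> str:
    """Map a possibly non-ASCII domain name to an ASCII-only form."""
    return name.encode("idna").decode("ascii")


def from_ace(name: str) -> str:
    """Map the ASCII form back to the original characters when needed."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    ace = to_ace("bücher.example")
    print(ace)            # xn--bcher-kva.example
    print(from_ace(ace))  # bücher.example
```

The ASCII form is typable on a 7-bit keyboard and passes through existing software, but it bears only an algorithmic relation to the name a user would recognize.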

d/

=-=-=-=-=
Dave Crocker  <[EMAIL PROTECTED]>
Brandenburg Consulting  
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Will Language Wars Balkanize the Web? ICANN timing

2000-12-07 Thread Dan Kolis



James Salsman [EMAIL PROTECTED] said: 
>I don't know why ICANN would want to bring such a heavy burden
>upon themselves in an area of such flux so soon, when they have 
>so much else that they have already committed to do.


Dan K says:
Well, the actual announcement at ICANN.Org doesn't really even hint at a
timetable.

I think it's got to happen (IDN DNS). But perhaps not too quickly.

I think the registrars that are pushing the envelope to apply this
prematurely are doing their customers something of a disservice.

Then again testing has to happen sooner or later.

I don't think there is anyone that wants to trade off doing it quickly for
doing it right. Getting more non technocrats involved would help to judge
the reality check parts of all of this. But that seems hard to arrange on a
meaningful scale without committing to the whole thing.

Regs,
Dan Kolis
 




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Vernon Schryver

> From: Harald Alvestrand <[EMAIL PROTECTED]>

> >The same thinking that says that MIME Version headers make sense in
> >general IETF list mail also says that localized alphabets and glyphs must
> >be used in absolutely all contexts, including those that everyone must
> >use and so would expect to be limited to the lowest common denominator.
>
> it may have escaped the notice of some that a fair bit of the discussion on 
> diacritics was carried out using live examples, and while I am sure there 
> were some who did not see the diacritics on screen, at least there was a 
> single definition of how to get from what was sent on the wire to what 
> might have been displayed on the screen, and MANY of the participants 
> actually saw them correctly displayed.

Diacritical marks are no different from Cyrillic, Arabic, Greek, Hebrew,
Sanskrit, and other non-Latin character sets in not being part of
the international language.  The goal of communicating is to communicate,
not wave flags in support of national languages.  When you are trying
to talk to strangers and have no clue about their languages, you are a
fool to not use the common, international language, no matter how poor
and ugly it is.


> MIME character sets is an example of a battle fought and won.

When MIME is used to pass special forms among people whose common
understandings include more or other than ASCII, MIME is a battle
fought and won.

When MIME is used to send unintelligible garbage, it is a battle fought
and lost.  Whether the garbage is HTML, the latest word processing
format from Redmond or a good representation of the mother tongue of
1,000,000,000 people is irrelevant to whether the use of MIME is wise
or foolish.  If the encoding is not known before hand to be intelligible
to its recipients, then the use of MIME is foolish.

MIME is a good *localization* mechanism, either in geography or culture
or in computer applications (e.g. pictures or sound).

The continuing IETF efforts to extend MIME to include yet more extra or
special forms in the vague hope that the recipient will surely be able to
interpret at least one is probably the best of what we can expect from
"internationalized" domain names in 2 or 3 years.  Unless something like
Vint Cerf's principle of encoding *localized* domain names in ASCII is
followed, the IDN efforts will at best repeat the history of MIME email
exemplified by the many Microsoft MIME formats.

In MIME, except in special cases, the "universal" form of the body is
either sufficient and the fancy versions useless wastes of cycles, storage,
and bandwidth, or the "universal" form can only say "sorry, better upgrade
your system."  Just as in the vast majority of HTML+ASCII email where
there can be no useful difference and there is rarely a visible
difference between the ASCII plaintext and the HTML encrypted version,
*localized* domain names will either be unusable outside their native
provinces or they will be usable with a 7-bit ASCII keyboard.


Vernon Schryver    [EMAIL PROTECTED]




Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Harald Alvestrand

At 15:35 06/12/2000 -0700, Vernon Schryver wrote:
>The same thinking that says that MIME Version headers make sense in
>general IETF list mail also says that localized alphabets and glyphs must
>be used in absolutely all contexts, including those that everyone must
>use and so would expect to be limited to the lowest common denominator.

it may have escaped the notice of some that a fair bit of the discussion on 
diacritics was carried out using live examples, and while I am sure there 
were some who did not see the diacritics on screen, at least there was a 
single definition of how to get from what was sent on the wire to what 
might have been displayed on the screen, and MANY of the participants 
actually saw them correctly displayed.

MIME character sets is an example of a battle fought and won.

--
Harald Tveit Alvestrand, [EMAIL PROTECTED]
+47 41 44 29 94
Personal email: [EMAIL PROTECTED]




Re: Will Language Wars Balkanize the Web? & P.S. Eudora/PalmOS

2000-12-06 Thread James P. Salsman

Masataka Ohta and Vernon Schryver make excellent points in favor 
of the domain name status quo.  I agree that IDN should be frozen 
for at least a few years to see what local domain admins and 
application vendors tend to do, especially since the pieces of 
the likely solutions (such as the competing UTF-8 encodings) are 
still so new and somewhat under development.

I don't know why ICANN would want to bring such a heavy burden
upon themselves in an area of such flux so soon, when they have 
so much else that they have already committed to do.

This thread reminded me of these news items, only two days apart:

http://abcnews.go.com/sections/travel/DailyNews/FrenchintheSkies000404.html

http://abcnews.go.com/sections/travel/DailyNews/BacktoFrenchinSkies000406.html

Cheers,
James

P.S.  By the way, on my usual topic of wireless asynchronous voice 
messaging, here is a news article in which Qualcomm founder and chief 
Irwin Jacobs asserts that "voice-enabled capabilities" "could prove 
popular" on third-generation mobile phones:

  http://biz.yahoo.com/rf/001206/hkg15073_2.html

I suppose Irwin Jacobs is the person to ask for MIME audio attachment 
record and play in Eudora email on the PalmOS.  Please ask in person 
if you see him in San Diego!




Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread Vernon Schryver

> From: Masataka Ohta <[EMAIL PROTECTED]>

> ...
> > (Can we please move this discussion to the IDN list, where it
> > belongs?)
>
> The point is that the IDN WG is purposeless and is wrong to exist. Of
> course, it is a waste of time to discuss it on the IDN list

Masataka Ohta is raising a point of order, and from what I've seen of
other "internationalization" efforts, it is probably more valid than
not.  That the IETF's effort nominally involves "internationalization"
instead of "localization" is a bad sign.

Since I first encountered "internationalization" hassles in the late
1970's in making an ASCII+EBCDIC system behave tolerably for people
typing and reading Arabic and Hebrew text, I've found that
"internationalization" is both technically hard and incredibly Politically
Correct.  Some people like to hoist standardized flags that today bear
"Respect for Diversity" and start marching over cliffs--no, that's wrong.
In Politically Correct issues, the standards bearers tell everyone else
to march over the cliff while they stand to attention nearby.

Once an "internationalization" organization gets started, it *never*
stops, no matter how many of the original participants get wise
and quit, what obviously false premise is required to justify the
latest conclusion, nor what damage has already been done (not to
mention contemplated) in the product, standard, protocol, or whatever
justifies the existence of the internationalization organization.
"Is the new version equally and completely useless for both domestic
and overseas users?--Great, let's fix the next one."

It took me about 10 years and more than one "internationalization"
organization to reach that politically incorrect conclusion.


> ...
> If people want local names let them have them under local domains,
> with all the local conventions on encoding and everything.
>
> The administrator of the local domains may or may not force people
> to have additional internationalized domain names.
>
> Note that local, here, means culturally (not necessarily geographically)
> local, so that ccTLDs may or may not be the local domains.
>
> But, it can be said that gTLDs are not a proper place to put local
> names.

The same thinking that says that MIME Version headers make sense in
general IETF list mail also says that localized alphabets and glyphs must
be used in absolutely all contexts, including those that everyone must
use and so would expect to be limited to the lowest common denominator.
When confronted with the fact that ANSI X3.4 (ASCII) is a provincial U.S.
variant of an international standard, otherwise rational people flinch
and claim that sending anything but 7-bit ASCII to major IETF lists is
not merely an unthinking waste of bandwidth but must be supported and
encouraged.  They justify such nonsense with talk like:

]diversity of list
] contributors' networking interests and experience (culture), which include
] people who happen to find it cost-effective to use such things as
] formatting and unusual character sets in their email. MIME is as much a
] part of the Internet culture as any standard 

(apologies to the author of that private message)

It is a mystery to me why otherwise reasonable people who would never
dream of imposing their own idiosyncrasies on everyone else demand that
others not only be allowed but encouraged to do so.

In other words, people have trouble understanding that
"internationalization" necessarily means restricting to the lowest
common international denominator instead of the impossible goal of
simultaneously supporting absolutely all possible languages and glyphs.


Vernon Schryver    [EMAIL PROTECTED]




Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread Masataka Ohta

John;

> (Can we please move this discussion to the IDN list, where it
> belongs?)

The point is that the IDN WG is purposeless and is wrong to exist. Of
course, it is a waste of time to discuss it on the IDN list. So, the
only reasonable reaction is to ignore it (I dropped improper CC:).

The only necessary discussion on domain names, IF ANY, is
localization issues, for which there is no specific WG of IETF.

> (iii) Regardless of how the names in the DNS are coded, it is
> important to have analogies to "two sided business cards".

A typical Japanese business card has Chinese characters.

When we internationalize it, we use the other side to put a Latin
character version.

As we already have a fully internationalized DNS with Latin
characters, Chinese characters in DNS are localization against
internationalization.

> And, because of the
> registration issue, there is no plausible way to impose a
> requirement that every host (or other DNS entry) have a name in
> ASCII if it has a name in some other script: people and hosts
> not visible outside their own countries may not care enough to
> go to the trouble.

Those are local issues.

If people want local names let them have them under local domains,
with all the local conventions on encoding and everything.

The administrator of the local domains may or may not force people
to have additional internationalized domain names.

Note that local, here, means culturally (not necessarily geographically)
local, so that ccTLDs may or may not be the local domains.

But, it can be said that gTLDs are not a proper place to put local
names.

Masataka Ohta




RE: Will Language Wars Balkanize the Web?

2000-12-06 Thread Hongwei

I can't agree more.

-Original Message-
From: John C Klensin [mailto:[EMAIL PROTECTED]]
Sent: 06 December 2000 16:46
To: vint cerf
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Will Language Wars Balkanize the Web?


(Can we please move this discussion to the IDN list, where it
belongs?)

--On Wednesday, 06 December, 2000 08:19 -0500 vint cerf
<[EMAIL PROTECTED]> wrote:

> Mr. Ohta has put his finger on a key point: ability of all
> parties to generate email addresses, web page URLs and so on.
> Even if we introduce extended character sets, it seems vital
> that there be some form of domain name that can be rendered
> (and entered) as simple IA4 characters to assure continued
> interworking at the most basic levels. This suggests that
> there is need for some correspondence between an IA4 Domain
> Name and any extended characterset counterpart.

Vint,

I think I agree with the principle.  However, there are several
different models with which the "correspondence" can be
implemented.  The difference among them is quite important
technically --implementations would need to occur in different
places and with different implications, deployment times, and
side effects--  and perhaps as important philosophically.  E.g.,
let me try to identify some of them in extreme form to help
identify the differences:

(i) The names in the DNS are "protocol elements".  They should
be expressed in a minimal subset of ASCII so that they can be
rendered and typed on almost all of the world's equipment (the
assumption that, e.g., all Chinese or Arabic keyboards and
display devices in the medium to long term will contain Roman
characters seems a little dubious).  There is no requirement
that they be mnemonic in any language: in principle, a string
containing characters selected at random would do as well as the
name of a company, person, or product.

This model gives rise to directory and keyword systems (most of
them outside the DNS) that contain the names that people use.
While the registration and name-conflict problems are
non-trivial, names in multiple languages and character codings
can easily map onto a single DNS identifier.  On the other hand,
binding a national-language name to an ASCII name would need to
be done either by parallel registrations or by matching on
keywords (and the latter might not yield unambiguous and
accurate results).

(ii) Entries in the DNS are always coded.  After all, "ASCII" is
just a code mapping between a human-visible character set and a
machine (or wire) representation.  It is the job of an
application to get from "characters" to "codes" and back, and to
recognize coding systems and apply the correct decodings.
And software that is old or broken will simply display a
different rendering of the coded form (whether that is a
"hexification" such as Base64 or some other system).  

This model gives rise to the "ACE all the way up" models, in
which non-ASCII names are placed in the DNS using some tagging
system, but the "ASCII representation" of a name that, in the
original, uses non-Roman characters, may be quite ugly and bear
no connection with the name as it would be rendered using the
original characters other than an algorithmic one.   It also
gives rise to some of the UTF-8 models, on the assumption that
applications that can't handle the full IS 10646 character set
can do something intelligent.

(iii) Regardless of how the names in the DNS are coded, it is
important to have analogies to "two sided business cards".  Such
systems assume that any name rendered in a non-Roman character
set should have an analogue in Roman characters.  And those
analogues are expected to be bound to the original form by
transliteration or translation -- they aren't just random,
algorithmically matching, strings.

While there need to be facilities for the non-Roman (even
non-ASCII) characters in either the DNS or a directory,
establishing the "ASCII names" is, of necessity, a registration
issue rather than an algorithmic issue.  We don't know how to do
the "translation" (or, in the general case, even
transliteration) algorithmically.   To give one example, despite
the "Han unification" of IS 10646, the characters on a Japanese
business card for you would almost certainly be different from
those on a Chinese business card for you.  And, because of the
registration issue, there is no plausible way to impose a
requirement that every host (or other DNS entry) have a name in
ASCII if it has a name in some other script: people and hosts
not visible outside their own countries may not care enough to
go to the trouble.

These models are not mutually exclusive.  But they are
definitely different perspectives.

It is also worth noting that, as a matter of perspective, the
dominance of subsets of ASCII in these debates has some

Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread John C Klensin

(Can we please move this discussion to the IDN list, where it
belongs?)

--On Wednesday, 06 December, 2000 08:19 -0500 vint cerf
<[EMAIL PROTECTED]> wrote:

> Mr. Ohta has put his finger on a key point: ability of all
> parties to generate email addresses, web page URLs and so on.
> Even if we introduce extended character sets, it seems vital
> that there be some form of domain name that can be rendered
> (and entered) as simple IA4 characters to assure continued
> interworking at the most basic levels. This suggests that
> there is need for some correspondence between an IA4 Domain
> Name and any extended characterset counterpart.

Vint,

I think I agree with the principle.  However, there are several
different models with which the "correspondence" can be
implemented.  The difference among them is quite important
technically --implementations would need to occur in different
places and with different implications, deployment times, and
side effects--  and perhaps as important philosophically.  E.g.,
let me try to identify some of them in extreme form to help
identify the differences:

(i) The names in the DNS are "protocol elements".  They should
be expressed in a minimal subset of ASCII so that they can be
rendered and typed on almost all of the world's equipment (the
assumption that, e.g., all Chinese or Arabic keyboards and
display devices in the medium to long term will contain Roman
characters seems a little dubious).  There is no requirement
that they be mnemonic in any language: in principle, a string
containing characters selected at random would do as well as the
name of a company, person, or product.

This model gives rise to directory and keyword systems (most of
them outside the DNS) that contain the names that people use.
While the registration and name-conflict problems are
non-trivial, names in multiple languages and character codings
can easily map onto a single DNS identifier.  On the other hand,
binding a national-language name to an ASCII name would need to
be done either by parallel registrations or by matching on
keywords (and the latter might not yield unambiguous and
accurate results).

(ii) Entries in the DNS are always coded.  After all, "ASCII" is
just a code mapping between a human-visible character set and a
machine (or wire) representation.  It is the job of an
application to get from "characters" to "codes" and back, and to
recognize coding systems and apply the correct decodings.
And software that is old or broken will simply display a
different rendering of the coded form (whether that is a
"hexification" such as Base64 or some other system).  

This model gives rise to the "ACE all the way up" models, in
which non-ASCII names are placed in the DNS using some tagging
system, but the "ASCII representation" of a name that, in the
original, uses non-Roman characters, may be quite ugly and bear
no connection with the name as it would be rendered using the
original characters other than an algorithmic one.   It also
gives rise to some of the UTF-8 models, on the assumption that
applications that can't handle the full IS 10646 character set
can do something intelligent.

(iii) Regardless of how the names in the DNS are coded, it is
important to have analogies to "two sided business cards".  Such
systems assume that any name rendered in a non-Roman character
set should have an analogue in Roman characters.  And those
analogues are expected to be bound to the original form by
transliteration or translation -- they aren't just random,
algorithmically matching, strings.

While there need to be facilities for the non-Roman (even
non-ASCII) characters in either the DNS or a directory,
establishing the "ASCII names" is, of necessity, a registration
issue rather than an algorithmic issue.  We don't know how to do
the "translation" (or, in the general case, even
transliteration) algorithmically.   To give one example, despite
the "Han unification" of IS 10646, the characters on a Japanese
business card for you would almost certainly be different from
those on a Chinese business card for you.  And, because of the
registration issue, there is no plausible way to impose a
requirement that every host (or other DNS entry) have a name in
ASCII if it has a name in some other script: people and hosts
not visible outside their own countries may not care enough to
go to the trouble.

These models are not mutually exclusive.  But they are
definitely different perspectives.
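
[Illustrative sketch, not part of the original message: model (i) above
describes a directory, outside the DNS, in which names in several languages
and scripts map onto a single DNS identifier. The Python below is one
minimal way to picture that. The directory contents, the opaque identifier
"x7k2q9.example", and the helper names normalize() and lookup() are all
hypothetical, and the normalization step (NFC plus case folding) is an
assumption rather than anything the thread specifies.]

```python
import unicodedata
from typing import Optional

def normalize(name: str) -> str:
    """Assumed matching rule: Unicode NFC normalization plus case folding."""
    return unicodedata.normalize("NFC", name).casefold()

# Hypothetical directory entries: several human-facing names, in different
# scripts, all bound to one opaque, ASCII-only DNS identifier.
_RAW_ENTRIES = {
    "Example Company": "x7k2q9.example",
    "Exemple Société": "x7k2q9.example",
    "例の会社": "x7k2q9.example",
}
DIRECTORY = {normalize(k): v for k, v in _RAW_ENTRIES.items()}

def lookup(name: str) -> Optional[str]:
    """Return the DNS identifier registered for a human-facing name, if any."""
    return DIRECTORY.get(normalize(name))

print(lookup("example company"))
print(lookup("例の会社"))
print(lookup("no such name"))
```

Each binding in such a directory is a registration, not a computation, which
is the cost the message attributes to this model.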

It is also worth noting that, as a matter of perspective, the
dominance of subsets of ASCII in these debates has some
important technical advantages (e.g., the code set can be made
very small and the canonicalization/matching rules are
algorithmic, universally-agreed, and trivial), but it is also
significantly an historical accident.  Because of that
historical accident, we tend to couch these discussions (as your
note does and as I have done above) in terms of ASCII <->
something-else mappings.  But it isn't hard to imagi

Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread vint cerf

Mr. Ohta has put his finger on a key point: ability of all
parties to generate email addresses, web page URLs and so on.
Even if we introduce extended character sets, it seems vital
that there be some form of domain name that can be rendered
(and entered) as simple IA4 characters to assure continued
interworking at the most basic levels. This suggests that
there is need for some correspondence between an IA4 Domain
Name and any extended characterset counterpart.

Vint

At 07:32 PM 12/6/2000 +0859, you wrote:
>And, if a mailto URL is within a webpage with a chinese character
>anchor, it does not matter whether a mail address in the URL
>consists of pure ASCII characters or not.
>
>> It's worth nothing that my computer could handle the address if I can't.
>
>You properly understand that the current ASCII DNS is already
>fully internationalized.




Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread Masataka Ohta

Claus;

> vint cerf <[EMAIL PROTECTED]> schrieb/wrote:
> > Incorporating other character sets without deep technical
> > consideration will risk the inestimable value of interworking across
> > the Internet. It CAN be done but there is a great deal of work to make
> > it function properly.
> 
> How do I type chinese characters? I can't. So I can't write mail to  
> someone whose email address contains non-ASCII characters if I don't  
> already have the address in electronic form (e.g. within a webpage).

Right.

And, if a mailto URL is within a webpage with a chinese character
anchor, it does not matter whether a mail address in the URL
consists of pure ASCII characters or not.

> It's worth nothing that my computer could handle the address if I can't.

You properly understand that the current ASCII DNS is already
fully internationalized.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Masataka Ohta

Ran;

> At 02:53 05/12/00, Martin J. Duerst wrote:
> >At 00/12/04 10:42 -0800, Christian Huitema wrote:
> >>So, at a minimum, we need an IETF
> >>specification on how to detect that a domain name part is using a non ascii
> >>encoding, so that DNS servers don't get lost.
> >
> >Why not just use UTF-8? It is an encoding of the UCS (aka
> >Unicode/ISO 10646), the encoding is fully compatible with
> >ASCII (all 7-bit bytes are ASCII and only ASCII), and it
> >is IETF policy (RFC 2277).
> 
> All,
> 
> Please MOVE this conversation to the IDN WG list,
> where it would be in scope. Btw, this specific question
> has been raised and answered several times now on the IDN list.
> I encourage folks to read the sundry IDN proposals before
> diving in any deeper here.

IDN is the perfect place for repeated silly conversation like above.

But it is not the place to discuss localized domain name, which
has nothing to do with internationalization.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Patrik Fältström

At 18.05 +0900 00-12-05, Martin J. Duerst wrote:
>ACE is (maybe) for machines. It's not primarily intended for humans.
>We may have ACE all the way (including TLD). It might be usable as a
>poor man's ASCII equivalent, but I strongly doubt that anybody will
>want to have it on the Latin side of their name card.

I would, because I know that people in many parts of the world don't 
know how to enter "snömos" on their keyboard, and if I register the 
domain "snömos.se", I really want people to be able to get to

   http://www.snömos.se

so I think it is perfectly alright to have

   http://www.bq--abzw55tnn5zq.se

on my business card (as well).

paf
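
[Illustrative sketch, not part of the original message: the "bq--" label
above is an ASCII-compatible encoding (ACE) of the non-ASCII name. That
particular scheme is not reproduced here. As a stand-in, the Python below
uses the "idna" codec from the Python standard library, which implements a
later and different ACE whose labels start with "xn--". It is meant only to
show the general idea: a reversible, purely algorithmic mapping between a
Unicode name and an ASCII-only form.]

```python
# Stand-in ACE: Python's built-in "idna" codec, NOT the "bq--" scheme
# quoted in the message above.
name = "www.snömos.se"

ace = name.encode("idna")        # ASCII-only bytes, usable by legacy software
round_trip = ace.decode("idna")  # algorithmic conversion back to Unicode

print(ace)
print(round_trip)
```

The encoded label bears no visual relation to the original name, which is
the trade-off being debated in this part of the thread.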


-- 




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Eric Brunner

Martin,

I'll send you a copy of the "@sign vs !path" debate from my USENIX papers
archive. See "Pathalias: or The Care and Feeding of Relative Addresses" by
Honeyman and Bellovin, undated, at http://www.uucp.org/papers/pathalias.pdf.

Speculations on the general utility and availability of "single" encoding
schemes or some approximation of limited ambiguity code-set mapping(s)
should not displace actual data. The claim that iso10646 is "good" is not
improved by non-reference to the costs and benefits of ASCII-colliding
encodings (EBCDIC, SJIS, etc.), just as the "interoperability" claim is
not improved by non-reference to the operational deployment of serviceable
encoding.

Ignoring the daft peculiarities of particular encodings (and ANSI C) such
as NULLs in strings (or file names), what I learned from owning the i18n
problem at Sun was that a program of code-set independence had time-to-market,
sustaining engineering, and ease of implementation arguments over a program
of opportunistic code-set dependence (the industry standard practice), and
as a matter of convenience, that the XPG/3 locale model made a utf8 locale a
minor cost item, and an internal convenience mechanism. It was a compelling
case whose hardest technical issue was dynamic character width determination
in the bottom-half of the tty subsystem.

I mention this to contrast it with substitution of UTF8 (or any fixed-width
multi-octet encoding scheme) dependence for ASCII dependence, or the common
form of an addition of an "alternate code path" which affords run-time
selection of one of two code-set dependent processing mechanisms.

>From my perspective, the IETF has preferred the second form of solution to
the problem since the appearance of rfc2130. See also the following rfcs:
0373, 1345, 1468, 1489, 1502, 1555, 1557, 1815, 1842, 1922,
1947, 2237, and 2319.

As I pointed out to you over lunch Thursday at the W3C AC meeting, the i18n
problem is not simplified by the constraint which requires reference to
iso639, or iso3166. While few ARPAnauts have an evident interest in the
problem of Euro-American Americanist hobbyists getting the fundamentals of
Cherokee wrong (or care that there are three Cherokee polities), in an ISO
normative reference (iso10646), on other lists (ICANN cluttered) Americans
of sundry "liberties" persuasions are quite worked up that Euro-American
Sinology hobbyists do not, or may not, have precedence over Chinese
governmental and cultural institutions on the operational deployment of
Chinese language elements in the DNS (CNNIC vs Verisign).

A related question is whether the i18n problem is simplified by a constraint
which requires reference to the IAB Technical Comment on the Unique DNS Root,
a constraint which adds, without reflection, the constraints of iso3166 to
the dns-i18n problem set. Again, from my perspective, several sets of critics
of the IANA transition(s), and its reluctant proponents, have overloaded the
dns-i18n problem set as either an escape mechanism from uniqueness of the
DNS root, or as a problem which cannot be solved except by preservation of
the same property (uniqueness).

Neither party appears to be motivated by the interests of users of ASCII
colliding or pre-iso10646 (et alia) encodings, or users without practicable
means to use their preferred writing (or signing) systems.

Assuming a heterogeneity of end-systems, each possibly with a heterogeneous
set of character encoded applications with some cut-buffer mediation
mechanism, e.g., a (encoding-neutral or encoding-preferential) windowing
system for transparent, or converted read and write operations between
end-system resident applications, and a DNS resolver library with access
to DNS service, and no additional constraints (these are enough, thanks!),
is UTF-8 _the_ compelling answer?

The attractions of Universalism still appear to be compelling, only if some
non-technical, or ancillary service model is controlling. Unfortunately,
the utility of Particularism is temporarily hijacked anywhere near the DNS
by partisans of one convention or its converse.

If next-hop has a case for forwarding, then it is surprising that the case
can't be applied to forwarding, except for opaque datagrams.

Cheers,
Eric

P.S. I forgot to work in NATs and VPNs. Sigh.




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread John Stracke

"Martin J. Duerst" wrote:

> The mixed case is not too
> important for us, as discussed above.

I think it can be, actually.  Suppose you've got someone living in Spain,
whose father is Spanish and whose mother is Japanese.  His full surname, then,
is something like Ohta y Montoya (or maybe the other way around; I don't
remember).  Now he wants to get a vanity domain, with "Ohta" in Japanese
characters and "y Montoya" in Roman letters.  He needs to be able to mix
character sets.

--
/===\
|John Stracke| http://www.ecal.com |My opinions are my own. |
|Chief Scientist |==|
|eCal Corp.  |"Fate just isn't what it used to be." --Hobbes|
|[EMAIL PROTECTED]|  |
\===/






Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Randy Bush

> Really big post offices have special places to handle things such
> as incomplete addresses. Nothing guaranteed, but if you are lucky,
> you may even successfully send a letter from an arbitrary place to
> anywhere in the world using local addressing, at least if you don't
> forget the country name in the local script.

tagging, eh?




RE: Will Language Wars Balkanize the Web?

2000-12-05 Thread RJ Atkinson

At 02:53 05/12/00, Martin J. Duerst wrote:
>At 00/12/04 10:42 -0800, Christian Huitema wrote:
>>So, at a minimum, we need an IETF
>>specification on how to detect that a domain name part is using a non ascii
>>encoding, so that DNS servers don't get lost.
>
>Why not just use UTF-8? It is an encoding of the UCS (aka
>Unicode/ISO 10646), the encoding is fully compatible with
>ASCII (all 7-bit bytes are ASCII and only ASCII), and it
>is IETF policy (RFC 2277).

All,

Please MOVE this conversation to the IDN WG list,
where it would be in scope.  Btw, this specific question
has been raised and answered several times now on the IDN list.
I encourage folks to read the sundry IDN proposals before
diving in any deeper here.

Thanks,

Ran




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread vint cerf

however the value of the public Internet is surely in its widespread
accessibility and interoperability.

vint

At 05:10 PM 12/5/2000 +0900, Martin J. Duerst wrote:
>I think there is a difference between making it technically possible
>for everybody to participate in whatever community they want, and
>forcing anybody to do so. Internet technology has shown that it's
>quite usable in local circumstances (the best example in Intranets).




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Martin J. Duerst

At 00/12/04 08:15 -0500, Dave Crocker wrote:

>Thank you.  I was hoping someone would point out the support for parallel 
>operation so we could go further down that path.  As you note, it seems to 
>be the closest to providing local/global support already.
>
>That means postal gives us:
>
>1. Global support for a common "character set"
>
>2. Global support for a carefully mixed character set -- though really it 
>is just a partitioning between the global field and the local field
>
>3. Local support for a local character set.
>
>(the support goes beyond character set, but let's leave it at that if 
>that's ok.)
>
>An immediate problem with comparing to postal is that it somewhat 
>correlates with the path a letter will take, so that the incremental 
>interpretation can be done by groups with different language skill-sets.

Really big post offices have special places to handle things such
as incomplete addresses. Nothing guaranteed, but if you are lucky,
you may even successfully send a letter from an arbitrary place to
anywhere in the world using local addressing, at least if you don't
forget the country name in the local script.


>The DNS does not have that flexibility and the domain name interpretation 
>is not part of the transfer sequence of the data.

Yes, there are quite some differences. The advantage we have is
that as soon as the characters are somehow in the computer,
everything else is mechanical. This means there is no need
for a global field; if somebody is able to type in the address,
that's it, the machine does the rest.


>Schemes that put an ACE-like field into a .com might be considered to be 
like #2, above, but really they are not.  The whole string is still global.

ACE is (maybe) for machines. It's not primarily intended for humans.
We may have ACE all the way (including TLD). It might be usable as a
poor man's ASCII equivalent, but I strongly doubt that anybody will
want to have it on the Latin side of their name card.


>Frankly this leaves me viewing the postal example as pretty unhelpful for 
>finding a solution to the DNS requirement.

Well, the postal example shows how Latin and other scripts can
both be used to address something. The mixed case is not too
important for us, as discussed above.

In the postal example, conversion from one notation to the other
is a complex process (in particular for Japanese, lookup in context
is absolutely necessary). So I don't expect that something purely
mechanical (e.g. ACE) will do for DNS.


>On the other hand, this thread was triggered by Graham's question about 
>the negative impact of partitioning.  The postal example would seem to 
>show that the effect is not so bad.




>Except I would claim that it is not partitioning.  Note that an address 
>always has a global representation, in addition to a possibly different 
>local one.

It's a kind of partitioning, in that it is not always easy,
for everybody, to use the 'local' address or to convert
from a local to a global one.


>Perhaps that can reconciled as easily as claiming that any 'local' domain 
>name must also have a global form? (But, somehow, the word "scaling" gets 
>in the way of believing that.)

Scaling would be only by a factor 2.


Regards,   Martin.




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Martin J. Duerst

At 00/12/04 19:58 -0500, Eric Brunner wrote:
> > I guess one of the first questions should be;  "Is some partitioning of 
> the
> > Internet community such a bad thing?"...
>
>If the "partition" intended for discussion is "@sign vs !path" addressing
>conventions, Eric Allman and Peter Honeyman have left a discussion archive
>on the subject.

Any pointers?

>Arguably the universalist thesis understated the drawbacks
>of anyone having the capability of addressing everyone anywhere. Clueless
>users is only one possible policy model -- a point made by Peter then, and
>equally valid now.
>
>Personally I'm underwhelmed by the universalism advocated by the members
>of the UNICODE Consortium, a single encoding scheme of necessity comes to
>peripheral markets late in their adoption of computerized writing systems,
>and their integration into a rationalized global system is not obviously a
>boon to their pre-integration service models.

Unicode came late to everybody's adoption of computerization of writing.
Most probably the delay is much longer for central markets than for
peripheral markets, but that would have to be checked.

Also, one main factor in the delay in many cases is the amount of time
it takes for the specific 'market' to agree on a single encoding scheme,
or encoding table, locally. In some cases (e.g. Korean), this is due to
the wide range of choices that the script offers for encoding. In other
cases, this is due to the fact that it takes some time (up to one
generation) for all the people who have proposed and implemented
different encodings not only to realize that everybody would benefit
from a single encoding, but also to accept that to a large extent,
which single encoding is chosen is far less important than that
a single one is chosen.


>On the up-side, large user bases need not adapt to extraneous requirements
>for participating in the "Internet community", and Universalist Credos may
>fail in the markets (plural intended).

I think there is a difference between making it technically possible
for everybody to participate in whatever community they want, and
forcing anybody to do so. Internet technology has shown that it's
quite usable in local circumstances (the best example being intranets).

Regards,   Martin.




RE: Will Language Wars Balkanize the Web?

2000-12-05 Thread Martin J. Duerst

At 00/12/04 10:42 -0800, Christian Huitema wrote:
>So, at a minimum, we need an IETF
>specification on how to detect that a domain name part is using a non ascii
>encoding, so that DNS servers don't get lost.

Why not just use UTF-8? It is an encoding of the UCS (aka
Unicode/ISO 10646), the encoding is fully compatible with
ASCII (all 7-bit bytes are ASCII and only ASCII), and it
is IETF policy (RFC 2277).

Regards,   Martin.
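
[Illustrative sketch, not part of the original message: the property claimed
above, "all 7-bit bytes are ASCII and only ASCII", can be checked directly.
The Python below encodes each character to UTF-8 and confirms that an octet
below 0x80 appears only where the character itself is ASCII. The function
name and the sample string are hypothetical, chosen for illustration.]

```python
def seven_bit_bytes_are_ascii_only(text: str) -> bool:
    """True if octets < 0x80 in the UTF-8 form come only from ASCII characters."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # An ASCII character encodes to the same single octet as in ASCII.
            if encoded != bytes([ord(ch)]):
                return False
        elif any(octet < 0x80 for octet in encoded):
            # A non-ASCII character must never produce a 7-bit octet.
            return False
    return True

sample = "www.snömos.se 日本語 example.com"
print(seven_bit_bytes_are_ascii_only(sample))
```

This is why software that inspects only ASCII octets (for example, to find
the "." separators) is not confused by UTF-8 names.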





Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Eric Brunner

> I guess one of the first questions should be;  "Is some partitioning of the 
> Internet community such a bad thing?"...

If the "partition" intended for discussion is "@sign vs !path" addressing
conventions, Eric Allman and Peter Honeyman have left a discussion archive
on the subject. Arguably the universalist thesis understated the drawbacks
of anyone having the capability of addressing everyone anywhere. Clueless
users is only one possible policy model -- a point made by Peter then, and
equally valid now.

Personally I'm underwhelmed by the universalism advocated by the members
of the UNICODE Consortium, a single encoding scheme of necessity comes to
peripheral markets late in their adoption of computerized writing systems,
and their integration into a rationalized global system is not obviously a
boon to their pre-integration service models.

> PS:  I think it is without doubt that it is a Good Thing that we make 
> efforts to internationalize protocols ...

Even less satisfactory is the practice of generalizing ASCII (nee BCD) to
encodings with more than 256 code points, via this universalist scheme and
no other. To advance from ASCII to ASCII-plus-UTF8 could be just as well
characterized as SJIS/GB/Big5/... (and their uses) deprecated.

>   ... my comments/questions are an 
> attempt to explore how far this process can reasonable go.

The i18n problem isn't trivial, and isn't advanced by  problematic essays,
good intentions, or American (actual and honorary) indulgences.

On the up-side, large user bases need not adapt to extraneous requirements
for participating in the "Internet community", and Universalist Credos may
fail in the markets (plural intended).

As for poking the ICANN mess in the eye with a sharpened brush on the IETF
list prior to a meeting, it is clumsy sleight-of-hand and a poor substitute
for work on writing system support.  See also the W3C WAI for information
encoding and presentation systems which are not "writing".

Kitakitamatsinopowaw,
Eric




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Keith Moore

> So, at a minimum, we need an IETF
> specification on how to detect that a domain name part is using a non ascii
> encoding, so that DNS servers don't get lost.

We need a great deal more than that.

The real impact of internationalizing DNS names isn't with the DNS 
protocol or software itself (you can probably do it without any changes 
to these), it is the applications that make assumptions about character 
encodings used in DNS names and/or place their own limitations on the 
allowable characters in DNS names.  

Keith
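
[Illustrative sketch, not part of the original message: the point above is
that applications, not the DNS protocol, impose the character limits. The
Python below shows the kind of letters-digits-hyphen check an application
might apply to a host name. The pattern LDH_LABEL and the function
app_accepts() are hypothetical and do not describe any particular program.]

```python
import re

# Hypothetical application-side rule: each label is letters, digits and
# hyphens only, and neither starts nor ends with a hyphen.
LDH_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")

def app_accepts(hostname: str) -> bool:
    """Return True if every dot-separated label passes the LDH check."""
    return all(LDH_LABEL.match(label) is not None
               for label in hostname.split("."))

print(app_accepts("www.example.com"))         # plain ASCII name
print(app_accepts("www.snömos.se"))           # non-ASCIIa label
print(app_accepts("_ldap._tcp.example.com"))  # underscore label
```

A name can be acceptable to a DNS server and still be rejected by a check
like this, so every such application would need to change.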




RE: Will Language Wars Balkanize the Web?

2000-12-04 Thread Christian Huitema

> On Sun, 03 Dec 2000 13:17:45 EST, vint cerf <[EMAIL PROTECTED]>  said:
> > to incorporate and refer to domain names. The IA4 alphabet 
> includes essentially
> > just the letters A-Z, numbers 0-9 and the "-" (dash). This 
> is the limit of what
> > is allowed in domain names today. 
> 
> The sad part is, of course, that RFC1035, section 3.1 
> specifically says
> that any octet value is legal.

The restrictions that Vint mentions are actually restrictions on the domain
name part of email addresses, as specified in RFC-821. The DNS system itself
does not have such restrictions; this allows, for example, RFC 2782 to specify
the use of the "illegal" character _ (underline) in some domain name parts.
The main restriction in the DNS itself is the comparison rule embedded in
the system, that says that domain names are case independent. Case
comparison is indeed specific to the alphabet code, and in fact is often
language dependent. The matter is already muddy for European
languages. In a case independent comparison in French, e-acute matches the
accentless e; in German, u-umlaut could match the digraph "ue"; DNS servers
don't do such matches, but at least they do the binary comparison right when
an 8-bit alphabet is a superset of ASCII. But the matter indeed gets more
complex when the characters are encoded on 16 bits, when either the top or
the bottom octet could be misinterpreted as a lower or upper case ascii letter,
resulting in incorrect matches. So, at a minimum, we need an IETF
specification on how to detect that a domain name part is using a non ascii
encoding, so that DNS servers don't get lost.

-- Christian Huitema
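
[Illustrative sketch, not part of the original message: the 16-bit problem
described above can be reproduced in a few lines. The Python below takes two
distinct ideographs, picked only because of their octet values, and applies
an ASCII-only case fold octet by octet, as a server unaware of the encoding
might. The function ascii_case_fold() is a hypothetical model of such a
comparison, not code from any DNS implementation.]

```python
def ascii_case_fold(raw: bytes) -> bytes:
    """Fold octets A-Z to a-z and leave every other octet unchanged."""
    return raw.lower()

# Two different characters: U+4E41 and U+4E61.
first, second = "\u4e41", "\u4e61"

# In a 16-bit big-endian encoding the octets are 0x4E 0x41 and 0x4E 0x61,
# which read as the ASCII strings "NA" and "Na".
utf16_match = (ascii_case_fold(first.encode("utf-16-be"))
               == ascii_case_fold(second.encode("utf-16-be")))

# In UTF-8 every octet of a non-ASCII character is >= 0x80, so the fold
# touches nothing and the two names stay distinct.
utf8_match = (ascii_case_fold(first.encode("utf-8"))
              == ascii_case_fold(second.encode("utf-8")))

print(utf16_match)
print(utf8_match)
```

The first comparison reports a match between two different names. The second
does not.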




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Robert G. Ferrell

>Wasn't there a Dilbert cartoon regarding sending a page to a pager number
>containing a caret? ;)

It was a tilde.

;-)

RGF

Robert G. Ferrell, CISSP
Information Systems Security Officer
National Business Center
U. S. Dept. of the Interior
[EMAIL PROTECTED]

 Who goeth without humor goeth unarmed.





Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Valdis . Kletnieks

On Sun, 03 Dec 2000 13:17:45 EST, vint cerf <[EMAIL PROTECTED]>  said:
> to incorporate and refer to domain names. The IA4 alphabet includes essentially
> just the letters A-Z, numbers 0-9 and the "-" (dash). This is the limit of what
> is allowed in domain names today. 

The sad part is, of course, that RFC1035, section 3.1 specifically says
that any octet value is legal.

But I guess we're stuck with the IA4 charset ;(


-- 
Valdis Kletnieks
Operating Systems Analyst
Virginia Tech




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Valdis . Kletnieks

On Sun, 03 Dec 2000 16:00:53 PST, lists <[EMAIL PROTECTED]>  said:
> "I'm sorry, I'm not going to be able to figure out how to type that email
> address on my keyboard, could you please send me a message, and I'll just hit
> reply".

Wasn't there a Dilbert cartoon regarding sending a page to a pager number
containing a caret? ;)
-- 
Valdis Kletnieks
Operating Systems Analyst
Virginia Tech





Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Dave Crocker

At 10:59 PM 12/4/00 +0859, Masataka Ohta wrote:
> > Thank you.  I was hoping someone would point out the support for parallel
> > operation so we could go further down that path.  As you note, it seems to
> > be the closest to providing local/global support already.
>
>Silly comparison.

Thank you.  We always seek to entertain.


>Efficient postal system works with numbers so called zip code.

Zip/postal code is not required.

The example given was of country string in 'global' form and remainder in 
local.


>Postal address with various characters needs human intervention for
>complex matching and is similar not to DNS but to search engines.

Machines frequently process the strings, but that is not relevant to the 
nature and use of the strings.

They are addresses, pertaining to location.  Search engines are for 
free-form keywords.  Not the same at all.

d/


=-=-=-=-=
Dave Crocker  <[EMAIL PROTECTED]>
Brandenburg Consulting  
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Masataka Ohta

Dave;

> Thank you.  I was hoping someone would point out the support for parallel 
> operation so we could go further down that path.  As you note, it seems to 
> be the closest to providing local/global support already.

Silly comparison.

An efficient postal system works with numbers, the so-called zip codes.

A postal address written in various characters needs human intervention for
complex matching, and is similar not to DNS but to search engines.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Dave Crocker


Thank you.  I was hoping someone would point out the support for parallel 
operation so we could go further down that path.  As you note, it seems to 
be the closest to providing local/global support already.

That means postal gives us:

1. Global support for a common "character set"

2. Global support for a carefully mixed character set -- though really it 
is just a partitioning between the global field and the local field

3. Local support for a local character set.

(the support goes beyond character set, but let's leave it at that if 
that's ok.)

An immediate problem with comparing to postal is that it somewhat 
correlates with the path a letter will take, so that the incremental 
interpretation can be done by groups with different language 
skill-sets.  The DNS does not have that flexibility and the domain name 
interpretation is not part of the transfer sequence of the data.

Schemes that put an ACE-like field into a .com might be considered to be 
like #2, above, but really they are not.  The whole string is still global.

Frankly this leaves me viewing the postal example as pretty unhelpful for 
finding a solution to the DNS requirement.

On the other hand, this thread was triggered by Graham's question about the 
negative impact of partitioning.  The postal example would seem to show 
that the effect is not so bad.

Except I would claim that it is not partitioning.  Note that an address 
always has a global representation, in addition to a possibly different 
local one.

Perhaps that can be reconciled as easily as claiming that any 'local' domain 
name must also have a global form? (But, somehow, the word "scaling" gets 
in the way of believing that.)

d/

At 05:20 PM 12/4/00 +0900, Martin J. Duerst wrote:
>At 00/12/03 13:57 -0500, Dave Crocker wrote:
>>Would it be such a bad thing to be unable to postal mail a letter or 
>>package to anywhere in the world?
>
>Of course it would be very bad. But it is usual now to send mail
>e.g. from Japan to Japan with an address without any Latin letters.
>It is also possible to send mail e.g. from the US or Europe to e.g.
>Japan, with all but the country name in ideographs.
>
>So the postal system is already now much closer to multilingual
>domain names than to ASCII-only domain names.
>
>It is also possible, as far as I understand, to send mail
>with an address only written in Latin letters, to any country
>in the world. The multilingual domain name solution should of
>course provide a way (at least one way) to do this.

=-=-=-=-=
Dave Crocker  <[EMAIL PROTECTED]>
Brandenburg Consulting  
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Martin J. Duerst

At 00/12/03 13:57 -0500, Dave Crocker wrote:
>Would it be such a bad thing to be unable to postal mail a letter or 
>package to anywhere in the world?

Of course it would be very bad. But it is usual now to send mail
e.g. from Japan to Japan with an address without any Latin letters.
It is also possible to send mail e.g. from the US or Europe to e.g.
Japan, with all but the country name in ideographs.

So the postal system is already now much closer to multilingual
domain names than to ASCII-only domain names.

It is also possible, as far as I understand, to send mail
with an address only written in Latin letters, to any country
in the world. The multilingual domain name solution should of
course provide a way (at least one way) to do this.

Please also note that Japanese name cards usually have two sides,
one in Japanese and one in Latin. Now, the email addresses on
both sides are the same, but in the future, you would just
use the one on the Latin side if you cannot type Japanese.


Regards,   Martin.




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Martin J. Duerst

At 00/12/03 08:03 +, Graham Klyne wrote:
>There's a news story at:
>
>   http://www.acm.org/technews/articles/2000-2/1201f.html#item10
>
>under the heading "Will Language Wars Balkanize the Web?"
>
>Leaving aside the issues of competing registries,

Sorry, but I think that's the main topic of the article (as far
as I can deduce from the abstract), and it is also the main
threat to create balkanization. The problem currently is not
that Chinese domain names may create a disconnect between
the "Chinese Internet" and some other part of the Internet,
but that there are various proposals and actors that are
working on Chinese domain names, and that all of them act
prematurely (i.e. before there is an IETF spec) and with
side interests that affect things negatively.


>touched upon in that article, I had been wondering with the formation of 
>IDN WG how I18N would affect cross-character-type-boundary Internet activities.
>
>I guess one of the first questions should be;  "Is some partitioning of 
>the Internet community such a bad thing?".  Why should it matter if, say, 
>Chinese-based domains aimed at Chinese audiences are not meaningfully 
>accessible to non-Chinese Internet users?

Reasonable question indeed. If the content is Chinese, does it hurt if the
address is also Chinese? There are cases where it indeed hurts (such as when
you have fonts to display Chinese on your system, but nothing to input
Chinese, as may be the case if you work off an English OS of some kind).
However, in general and for the majority of actual users (i.e. for
the Chinese users reading Chinese web pages,...), having Chinese
domain names is actually a big advantage. They are easier to
memorize, easier to guess, easier to identify with, and so on.


>At a purely technological level, the priority ascribed to the end-to-end 
>architecture of the Internet has underpinned and presumed 
>non-discriminatory any-to-any communication.  I wonder if this is a 
>reasonable expectation at the social level of Internet use.

At the *linguistic* level, there are certain rather hard boundaries
based on the difficulty of learning foreign languages and on the
slow advances of machine translation. At the social level, boundaries
should be kept as low as possible.

Regards,   Martin.




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Kimon A. Andreou


- Original Message -
From: "lists" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, December 03, 2000 19:00
Subject: Re: Will Language Wars Balkanize the Web?


>
>
> "I'm sorry, I'm not going to be able to figure out how to type that email
> address on my keyboard, could you please send me a message, and I'll just hit
> reply".
>
> Adi
>


Good point.

I didn't think about e-mail addresses.


Kimon








Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Randy Bush

> "I'm sorry, I'm not going to be able to figure out how to type that email
> address on my keyboard, could you please send me a message, and I'll just hit
> reply".

if the app-presentation -> internal coding -> dns request mapping is not
one:one and reversible on the other end, even this is not sure to work.
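
A minimal sketch of what that round-trip requirement means in practice. It assumes Python's built-in `punycode` codec as a stand-in for the "internal coding" step; Punycode was standardized after this thread, so it is used here purely as a convenient example of a reversible mapping. All function names are hypothetical.

```python
def to_wire(label: str) -> bytes:
    """Map a presentation-form label to an ASCII-only form (Punycode, an assumption)."""
    return label.encode("punycode")


def from_wire(wire: bytes) -> str:
    """Inverse of to_wire: recover the presentation-form label."""
    return wire.decode("punycode")


def lossy_to_wire(label: str) -> bytes:
    """A deliberately bad mapping: every non-ASCII character collapses to '?'."""
    return label.encode("ascii", errors="replace")


def lossy_from_wire(wire: bytes) -> str:
    """Best-effort inverse of the lossy mapping; the original characters are gone."""
    return wire.decode("ascii")


def round_trips(label: str, encode, decode) -> bool:
    """True when decode(encode(label)) gives back exactly the label typed."""
    return decode(encode(label)) == label
```

With the reversible mapping, `round_trips("bücher", to_wire, from_wire)` holds, so "hit reply" reaches the right name. With the lossy mapping, two distinct labels such as "bücher" and "bächer" collapse to the same ASCII string, and the reply can no longer be sure to go where the original came from.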

randy




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread lists



On Sun, Dec 03, 2000 at 04:56:38PM -0500, Kimon A. Andreou wrote:
> 
> > You can't address a letter to someone in Berkeley, USA in nagari or amharic
> > characters and expect it to reach. However you can address a letter to someone
> > in Addis Ababa, Ethiopia in ASCII characters with a poor-phonetic
> > approximation and expect it to reach (choice of locales based on experience).
> >
> 
> >
> > Adi
> 
> But don't packets get routed using IP addresses  (i.e. numbers) ?

er, wrong layer. Although I'm as good at remembering IP addresses as phone
numbers, you'll have a hard time convincing others to give up DNS.

"I'm sorry, I'm not going to be able to figure out how to type that email
address on my keyboard, could you please send me a message, and I'll just hit
reply".

Adi




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Dave Crocker

Kimon gets an A.  Betsy gets an F.

d/

At 03:30 PM 12/3/00 -0500, Kimon A. Andreou wrote:
>But isn't the Internet a medium of communication as is the Post and the
>telephone?
>Therefore, shouldn't it support communication between any two points,
>wherever they may be or however they're called?
>
>Kimon
>- Original Message -
>From: "Betsy Brennan" <[EMAIL PROTECTED]>
> > But the Internet is not the postal system nor the phone system. We already
> > have the postal system and the phone system.  T

=-=-=-=-=
Dave Crocker  <[EMAIL PROTECTED]>
Brandenburg Consulting  
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Kimon A. Andreou


- Original Message -
From: "R . P . Aditya" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, December 03, 2000 16:20
Subject: Re: Will Language Wars Balkanize the Web?



> You can't address a letter to someone in Berkeley, USA in nagari or amharic
> characters and expect it to reach. However you can address a letter to someone
> in Addis Ababa, Ethiopia in ASCII characters with a poor-phonetic
> approximation and expect it to reach (choice of locales based on experience).
>

>
> Adi

But don't packets get routed using IP addresses  (i.e. numbers) ?

Kimon






Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread R . P . Aditya

As has been noted, the _hard part_ is making the protocol that is used between
countries' communications systems "language independent".

> > Would it be such a bad thing to be unable to make a phone call to anywhere
> > in the world?

I have yet to see a telephone dialpad that even has non-arabic base-10 numbers
on it (has it slowed the spread and use of the phone system?).

> > Would it be such a bad thing to be unable to postal mail a letter or
> > package to anywhere in the world?

You can't address a letter to someone in Berkeley, USA in nagari or amharic
characters and expect it to reach. However you can address a letter to someone
in Addis Ababa, Ethiopia in ASCII characters with a poor-phonetic
approximation and expect it to reach (choice of locales based on experience).

At some point it's not worth the effort to "internationalize" all the
layers...will the lucrative returns on additional domains pay for such an
effort? and will that make an already "complex" Internet more accessible?

Does Babelization without language isomorphism lead to Balkanization? Or, "why
is machine translation so hard?".

Adi

On Sun, Dec 03, 2000 at 03:06:10PM -0500, Betsy Brennan wrote:
> But the Internet is not the postal system nor the phone system. We already
> have the postal system and the phone system.  They may be slower, but does
> that mean they should be replaced or that the Internet must duplicate what
> these systems do? BLB
> 
> Dave Crocker wrote:
> 
> > At 08:03 AM 12/3/00 +, Graham Klyne wrote:
> > >I guess one of the first questions should be;  "Is some partitioning of
> > >the Internet community such a bad thing?".
> >
> > Would it be such a bad thing to be unable to make a phone call to anywhere
> > in the world?
> >
> > Would it be such a bad thing to be unable to postal mail a letter or
> > package to anywhere in the world?
> >
> > d/
> >
> > ps.  strictly rhetorical questions, as I hope is obvious.
> >
> > =-=-=-=-=
> > Dave Crocker  <[EMAIL PROTECTED]>
> > Brandenburg Consulting  
> > Tel: +1.408.246.8253,  Fax: +1.408.273.6464
> 




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Kimon A. Andreou

But isn't the Internet a medium of communication as is the Post and the
telephone?
Therefore, shouldn't it support communication between any two points,
wherever they may be or however they're called?

Kimon


- Original Message -
From: "Betsy Brennan" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Sunday, December 03, 2000 15:06
Subject: Re: Will Language Wars Balkanize the Web?


> But the Internet is not the postal system nor the phone system. We already
> have the postal system and the phone system.  They may be slower, but does
> that mean they should be replaced or that the Internet must duplicate what
> these systems do? BLB
>







Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Randy Bush

> But the Internet is not the postal system nor the phone system. We already
> have the postal system and the phone system.  They may be slower, but does
> that mean they should be replaced or that the Internet must duplicate what
> these systems do?

i am sorry, but i can not understand the above.  perhaps you were writing in
californian.  qed.

randy




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Betsy Brennan

But the Internet is not the postal system nor the phone system. We already
have the postal system and the phone system.  They may be slower, but does
that mean they should be replaced or that the Internet must duplicate what
these systems do? BLB

Dave Crocker wrote:

> At 08:03 AM 12/3/00 +, Graham Klyne wrote:
> >I guess one of the first questions should be;  "Is some partitioning of
> >the Internet community such a bad thing?".
>
> Would it be such a bad thing to be unable to make a phone call to anywhere
> in the world?
>
> Would it be such a bad thing to be unable to postal mail a letter or
> package to anywhere in the world?
>
> d/
>
> ps.  strictly rhetorical questions, as I hope is obvious.
>
> =-=-=-=-=
> Dave Crocker  <[EMAIL PROTECTED]>
> Brandenburg Consulting  
> Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Karl Auerbach


> I guess one of the first questions should be;  "Is some partitioning of the 
> Internet community such a bad thing?".  Why should it matter if, say, 
> Chinese-based domains aimed at Chinese audiences are not meaningfully 
> accessible to non-Chinese Internet users?

There's a distinct issue that exists apart from the inter-human aspects -
the packets containing these new character forms will flow, at least
occasionally, into pretty much everyone's machines, routers, NATs,
firewalls, web caches, etc - all of which need to be able to handle these
new packets without ill effects.  (The definition of "ill effect" will
vary depending on what the box is supposed to be doing.)

For instance, it would be "a bad thing" if some "transparent" web cache in
some ISP went south when it re-resolved a URL that contained a domain name
that either had itself a label in some non-hostname character set or was
resolved via a CNAME containing non-hostname characters.

In other words, although the humans (and their user interfaces) may
Balkanize, the infrastructure on which the net operates should not.
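
One way to read "handle these new packets without ill effects" is that infrastructure code should treat labels as opaque octet strings and never assume they are printable hostnames. The sketch below is a hypothetical encoder and parser for uncompressed names in the RFC 1035 length-prefixed layout. It is an illustration under those assumptions, not a description of how any particular cache or resolver works, and it does not handle compression pointers.

```python
from typing import List


def encode_name(labels: List[bytes]) -> bytes:
    """Serialize labels as length-prefixed octet strings, ending with a zero octet."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("label must be 1..63 octets")
        out.append(len(label))
        out += label
    out.append(0)  # the zero-length root label terminates the name
    if len(out) > 255:
        raise ValueError("name exceeds 255 octets")
    return bytes(out)


def decode_name(wire: bytes) -> List[bytes]:
    """Parse an uncompressed name; label contents are copied, never interpreted."""
    labels: List[bytes] = []
    pos = 0
    while True:
        if pos >= len(wire):
            raise ValueError("truncated name")
        length = wire[pos]
        pos += 1
        if length == 0:
            return labels
        if length > 63:
            raise ValueError("compression pointers are not handled in this sketch")
        if pos + length > len(wire):
            raise ValueError("truncated label")
        labels.append(wire[pos:pos + length])
        pos += length
```

Because the parser only looks at length octets, a label containing UTF-8 or any other non-hostname bytes passes through unchanged. Whether an application later accepts that label is a separate policy decision.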

--karl--








Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread vint cerf

In my opinion, it is vital to craft Internet's evolution so as to maintain
full connectivity and interworking among all its parts. I do not see
"balkanization" as a good thing at all. I believe there are sound technical
means to achieve the objective of incorporating character sets associated
with non-roman languages but that critics need to understand more fully just
how important the limitations of the current character set for domain names
have been in maintaining interworking and also ability of so many applications
to incorporate and refer to domain names. The IA4 alphabet includes essentially
just the letters A-Z, numbers 0-9 and the "-" (dash). This is the limit of what
is allowed in domain names today. 

Incorporating other character sets without deep technical consideration will
risk the inestimable value of interworking across the Internet. It CAN be done
but there is a great deal of work to make it function properly.

Vint

At 08:03 AM 12/3/2000 +, Graham Klyne wrote:
>There's a news story at:
>
>  http://www.acm.org/technews/articles/2000-2/1201f.html#item10
>
>under the heading "Will Language Wars Balkanize the Web?"
>
>Leaving aside the issues of competing registries, touched upon in that article, I had 
>been wondering with the formation of IDN WG how I18N would affect 
>cross-character-type-boundary Internet activities.
>
>I guess one of the first questions should be;  "Is some partitioning of the Internet 
>community such a bad thing?".  Why should it matter if, say, Chinese-based domains 
>aimed at Chinese audiences are not meaningfully accessible to non-Chinese Internet 
>users?  At a purely technological level, the priority ascribed to the end-to-end 
>architecture of the Internet has underpinned and presumed non-discriminatory 
>any-to-any communication.  I wonder if this is a reasonable expectation at the social 
>level of Internet use.
>
>#g
>
>PS:  I think it is without doubt that it is a Good Thing that we make efforts to 
>internationalize protocols;  my comments/questions are an attempt to explore how far 
>this process can reasonably go.
>
>
>Graham Klyne
>([EMAIL PROTECTED])




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Dave Crocker

At 08:03 AM 12/3/00 +, Graham Klyne wrote:
>I guess one of the first questions should be;  "Is some partitioning of 
>the Internet community such a bad thing?".

Would it be such a bad thing to be unable to make a phone call to anywhere 
in the world?

Would it be such a bad thing to be unable to postal mail a letter or 
package to anywhere in the world?

d/

ps.  strictly rhetorical questions, as I hope is obvious.


=-=-=-=-=
Dave Crocker  <[EMAIL PROTECTED]>
Brandenburg Consulting  
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread RJ Atkinson

At 03:03 03/12/00, Graham Klyne wrote:
>I guess one of the first questions should be;  "Is some partitioning of the Internet 
>community such a bad thing?"

A partitioning based on nationality, which is of course
different than language group, would be harmful.  Lack of
interoperability of standard protocols would be bad, for
whatever reason, including incompatible localisations.  Lack
of standards support for internationalisation/multi-lingual
computing, as different from localisation, would also be bad.

>  Why should it matter if, say, Chinese-based domains aimed 
>at Chinese audiences are not meaningfully accessible to 
>non-Chinese Internet users?  

What about people who can read and perhaps also write
in Chinese characters but who are not Chinese (either ROC
on Taiwan or PRC on the mainland) nationals ?  Consider
not only folks in Singapore or SE Asia generally, but also 
Chinese-capable folks in other places (e.g. North America, 
Europe).  [NB: I'm deliberately ignoring the issues with 
Traditional vs Simplified characters just now, though that
is also part of the internationalisation equation].  

I regularly read my news from British or Hong Kong
or other countries' web sites.  Living in North America,
I'm certainly not the target audience for the HK Standard
or South China Morning Post.  However, I do read those 
newspapers online.  Less regularly, but occasionally,
I do read Chinese web sites (in Chinese) or Japanese web
sites (reading the Kanji portion only).  I am most assuredly 
NOT the target audience for any of these web sites.

On a daily basis, I receive mail with Chinese language
contents, though a surprising amount of that turns out to
be unsolicited bulk email in my own case.  I receive a modest
amount of German or Vietnamese email.  So multi-lingual protocol
capabilities are quite important to me.

So for all those reasons, it does in fact matter
a great deal.

>At a purely technological level, the priority ascribed to the end-to-end architecture 
>of the Internet has underpinned and presumed non-discriminatory any-to-any 
>communication.  I wonder if this is a reasonable expectation at the social level of 
>Internet use.

I do think so.

>PS:  I think it is without doubt that it is a Good Thing that we make efforts to 
>internationalize protocols;  my comments/questions are an attempt to explore how far 
>this process can reasonably go.

I don't want to try to predict the future, so I won't.  
I can say that today, we are NOT anywhere close to a reasonable 
end point or stopping point for internationalisation of IETF 
standards-track protocols.  In particular, we haven't resolved
the basic internationalisation issues for a number of core 
infrastructure protocols (e.g. DNS).

Regards,

Ran
[EMAIL PROTECTED]




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Masataka Ohta

Graham;

> Leaving aside the issues of competing registries, touched upon in that 
> article, I had been wondering with the formation of IDN WG how I18N would 
> affect cross-character-type-boundary Internet activities.

Nothing.

Cross-character-type-boundary is a pure localization issue
and has nothing to do with people wrongly working on I18N.

> PS:  I think it is without doubt that it is a Good Thing that we make 
> efforts to internationalize protocols;

If only you understood what "internationalize protocols" means.

ASCII (Latin, numeric and hyphen) characters are the only characters
internationally recognizable by so many people.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Randy Bush

you may want to look at the work going on in the idn wg.

randy




Will Language Wars Balkanize the Web?

2000-12-03 Thread Graham Klyne

There's a news story at:

   http://www.acm.org/technews/articles/2000-2/1201f.html#item10

under the heading "Will Language Wars Balkanize the Web?"

Leaving aside the issues of competing registries, touched upon in that 
article, I had been wondering with the formation of IDN WG how I18N would 
affect cross-character-type-boundary Internet activities.

I guess one of the first questions should be;  "Is some partitioning of the 
Internet community such a bad thing?".  Why should it matter if, say, 
Chinese-based domains aimed at Chinese audiences are not meaningfully 
accessible to non-Chinese Internet users?  At a purely technological level, 
the priority ascribed to the end-to-end architecture of the Internet has 
underpinned and presumed non-discriminatory any-to-any communication.  I 
wonder if this is a reasonable expectation at the social level of Internet use.

#g

PS:  I think it is without doubt that it is a Good Thing that we make 
efforts to internationalize protocols;  my comments/questions are an 
attempt to explore how far this process can reasonably go.


Graham Klyne
([EMAIL PROTECTED])