Re: Will Language Wars Balkanize the Web?

2000-12-08 Thread Fred Baker

At 03:49 AM 12/8/00 +0859, Masataka Ohta wrote:
However, they can't justify calling them internationalization.

precisely.




Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Harald Alvestrand

At 15:35 06/12/2000 -0700, Vernon Schryver wrote:
The same thinking that says that MIME Version headers make sense in
general IETF list mail also says that localized alphabets and glyphs must
be used in absolutely all contexts, including those that everyone must
use and so would expect to be limited to the lowest common denominator.

It may have escaped the notice of some that a fair bit of the discussion on 
diacritics was carried out using live examples, and while I am sure there 
were some who did not see the diacritics on screen, at least there was a 
single definition of how to get from what was sent on the wire to what 
might have been displayed on the screen, and MANY of the participants 
actually saw them correctly displayed.

MIME character sets are an example of a battle fought and won.

--
Harald Tveit Alvestrand, [EMAIL PROTECTED]
+47 41 44 29 94
Personal email: [EMAIL PROTECTED]




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Vernon Schryver

 From: Harald Alvestrand [EMAIL PROTECTED]

 The same thinking that says that MIME Version headers make sense in
 general IETF list mail also says that localized alphabets and glyphs must
 be used in absolutely all contexts, including those that everyone must
 use and so would expect to be limited to the lowest common denominator.

 It may have escaped the notice of some that a fair bit of the discussion on 
 diacritics was carried out using live examples, and while I am sure there 
 were some who did not see the diacritics on screen, at least there was a 
 single definition of how to get from what was sent on the wire to what 
 might have been displayed on the screen, and MANY of the participants 
 actually saw them correctly displayed.

Diacritical marks are no different from Cyrillic, Arabic, Greek, Hebrew,
Sanskrit, and other non-Latin character sets in not being part of
the international language.  The goal of communicating is to communicate,
not wave flags in support of national languages.  When you are trying
to talk to strangers and have no clue about their languages, you are a
fool not to use the common, international language, no matter how poor
and ugly it is.


 MIME character sets are an example of a battle fought and won.

When MIME is used to pass special forms among people whose common
understanding includes more than, or other than, ASCII, MIME is a battle
fought and won.

When MIME is used to send unintelligible garbage, it is a battle fought
and lost.  Whether the garbage is HTML, the latest word processing
format from Redmond, or a good representation of the mother tongue of
1,000,000,000 people is irrelevant to whether the use of MIME is wise
or foolish.  If the encoding is not known beforehand to be intelligible
to its recipients, then the use of MIME is foolish.

MIME is a good *localization* mechanism, either in geography or culture
or in computer applications (e.g. pictures or sound).

The continuing IETF efforts to extend MIME to include yet more extra or
special forms in the vague hope that the recipient will surely be able to
interpret at least one are probably the best of what we can expect from
"internationalized" domain names in 2 or 3 years.  Unless something like
Vint Cerf's principle of encoding *localized* domain names in ASCII is
followed, the IDN efforts will at best repeat the history of MIME email
exemplified by the many Microsoft MIME formats.

In MIME, except in special cases, the "universal" form of the body is
either sufficient and the fancy versions useless wastes of cycles, storage,
and bandwidth, or the "universal" form can only say "sorry, better upgrade
your system."  Just as in the vast majority of HTML+ASCII email where
there is can be no useful difference and there is rarely a visible
difference between the ASCII plaintext and the HTML encrypted version,
*localized* domain names will either be unusable outside their native
provinces or they will be usable with a 7-bit ASCII keyboard.


Vernon Schryver[EMAIL PROTECTED]




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread John Stracke

Keith Moore wrote:

 Furthermore, a
 great many people use multiple languages (not necessarily including
 English), so that a given person, host, or subnetwork will often
 need to exist in multiple (potentially competing) locales at once.

Sometimes even in the same sentence.  My mother grew up partly in Quebec;
when she's talking to her siblings, they'll often use French words when the
English ones don't come to mind immediately.

--
/==\
|John Stracke| http://www.ecal.com |My opinions are my own.|
|Chief Scientist |=|
|eCal Corp.  |How many roads must a man walk down before he|
|[EMAIL PROTECTED]|admits he is LOST?   |
\==/






Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Theodore Y. Ts'o

   Date: Thu, 07 Dec 2000 07:23:11 -0500
   From: Dave Crocker [EMAIL PROTECTED]

   At least the recipient has the unintelligible data well isolated and 
   labeled.  MIME did its job.

Indeed.  If I get a mail message which is in HTML only, 99.97% of the
time it's SPAM-mail.  And I've lost count of how many times I've received
Chinese (or other Asian language) SPAM-mail.  In fact, I'm seriously
thinking about coding up a rule which automatically junks HTML mail
unread. 

I guess MIME is useful for something.  :-)


- Ted




Re: Will Language Wars Balkanize the Web?

2000-12-07 Thread Keith Moore

  you missed it. Suppose you could not exchange in commerce with a person of
  a given nationality, not because you did not have a language in common with
  him or her, but because your system could not interpret his or her name.
  That would mean that you could not spend money in that person's direction,
  because you could not communicate with him or her.
 
 And it means that person is at a disadvantage in your marketspace, and
 that it's not your problem.

why in the world do people think they can justify or not justify actions
based on whether something is an advantage/disadvantage in some 
"marketspace"?

Keith




Re: Will Language Wars Balkanize the Web?

2000-12-07 Thread Masataka Ohta

Keith;

   you missed it. Suppose you could not exchange in commerce with a person of
   a given nationality, not because you did not have a language in common with
   him or her, but because your system could not interpret his or her name.
   That would mean that you could not spend money in that person's direction,
   because you could not communicate with him or her.
  
  And it means that person is at a disadvantage in your marketspace, and
  that it's not your problem.
 
 why in the world do people think they can justify or not justify actions
 based on whether something is an advantage/disadvantage in some 
 "marketspace"?

They can justify them locally within local marketspaces, of course.

However, they can't justify calling them internationalization.

Masataka Ohta




Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Vernon Schryver

 From: Henk Langeveld [EMAIL PROTECTED]

 You know, it isn't that long ago that I realised that for many Americans,
 "International" is synonymous with "Non-American".

That is as true as the observation that many who learn English as a
second language think that "international" is synonymous with using
the language of their few dozen million countrymen.  

It is a fact that the single international language of the late 20th and
early 21st centuries is far more closely related to a subset of American English
than any other local language.  It is also a fact that only during my
lifetime has that odd situation developed.  If the world had asked you or
me to design an international language, I think either of us would have
done better.  But the first fact is all that matters.

If it makes you feel better, note that just as Latin was not exactly what
Italians spoke, the current international language is not exactly what is
spoken by citizens of the largest nation that calls itself The United
States of America (there is more than one) and whose mother tongue is English.
Thanks to satellite TV and other forms of what the P.C. call cultural
imperialism, the modern differences are small, but they exist.


 From: Dave Crocker [EMAIL PROTECTED]

 Diacritical marks are no different from Cyrillic, Arabic, Greek, Hebrew,
 Sanskrit, and other non-Latin character sets in not being part of
 the international language.  The goal of communicating is to communicate,
 not wave flags in support of national languages.

 In a sense, Harald's observation points out a case in which all those other 
 sets very much ARE part of the "international" language.

If those are part of your "international language," then what characters
are not part of it?  It is Politically Correct to pretend we all speak,
read, and write a single language, but also hopelessly silly.


 It does not matter whether readers understood the semantics of the strings; 
 they needed to be able to see them.
  That is not national flag waving.
  That is global utility.

"Global unity" is a matter of everyone being able to communicate with
everyone else.  It not only has nothing to do with each of us using
our favorite set of glyphs, but goes against it.  Each of us using our
favorite language *internationally* is a real Tower of Babel.

Being able to use strings is not only a matter of being able to type their
characters.  Those of us who have studied languages with alphabets other
than those we learned while young have discovered that just as the human ear
has difficulty hearing sounds outside our mother tongues, the human eye
has trouble seeing foreign glyphs.  If they're not yours, all of those
diacritical marks look the same or are invisible.

There are good reasons why the international lingua francas of previous
millennia have forced people to transliterate their native writings
instead of importing them wholesale.  MIME and 8-bit domain names are
mechanisms for importing wholesale instead of transliterating.  They're
good *locally*, but not *internationally*.

 ...
 Technical standards work often gets distracted by trying to deal with 
 issues that are outside the scope of reasonable technical standards 
 work.  It should not be the task of such work to dictate or constrain users 
 to only socially acceptable behavior.  That is a social task, not a 
 technical one.

Yes.  So why do otherwise rational IETF participants claim that
social and political notions such as "global unity" are somehow
related to MIME and IDN?

MIME and localized domain names are good and necessary, but only
locally or provincially, even when "locally" involves vast land
areas (e.g. Russian or Spanish) or billions of people.

 Choosing to send various types of data requires making decisions about the 
 context.  No technical standard can be designed to "automatically" 
 determine when it is, or is not, appropriate to send that data, whether it 
 is diacritical marks, kanji, or an excel spread sheet.  Even when the 
 sender has information about recipient capabilities, social factors affect 
 the choices.

Yes, so why do some MIME and *localized* domain name advocates claim
otherwise?  What is the pathology behind insisting that sending MIME to
international mailing lists makes sense?  Why do apparently rational people
claim that 8-bit binary domain names are "international"?  Because they've been
infected with Political Correctness or because they don't want to dilute
political support among the unthinking for whatever they're advocating?


 ...
 At least the recipient has the unintelligible data well isolated and 
 labeled.  MIME did its job.

Yes, but the justification of the sender for using MIME to send
unintelligible data is crazy, since communication is averted while
resources, including the human recipient's time, are wasted.


 ...
 The question is whether a coherent extension to DNS will be done in a 
 fashion which will keep the DNS integrated, or whether this requirement 
 produces an 

Re: Internationalization and the IETF (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Matt Crawford

  If the world had asked you or me to design an international
 language, I think either of us would have done better.

Don't be too sure.  Even today, there are no more speakers of
Esperanto than of Mayan.




end to end (Re: Will Language Wars Balkanize the Web?)

2000-12-07 Thread Dave Crocker

At 06:21 PM 12/6/00 +, Graham Klyne wrote:
BTW, the basic tenet of end-to-end connectivity of data and services is, I 
think, satisfied by the IP layer.  Part of my question was about the 
extent to which this end-to-end-ness needs to be duplicated at higher layers.

Not sure whether this is a distraction -- hence the modified Subject -- but 
I do NOT consider an end-to-end mechanism at one level to be sufficient, 
when talking about end-to-end at another level.

Lower layers must support the e2e requirements of the layer under 
discussion, but those lower layers do not satisfy the requirements by 
themselves.

If the layer under discussion, in this case the DNS application, does not 
support e2e, then the fact that IP does, does not buy much.

d/


=-=-=-=-=
Dave Crocker  [EMAIL PROTECTED]
Brandenburg Consulting  www.brandenburg.com
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread Masataka Ohta

Claus;

 vint cerf [EMAIL PROTECTED] schrieb/wrote:
  Incorporating other character sets without deep technical
  consideration will risk the inestimable value of interworking across
  the Internet. It CAN be done but there is a great deal of work to make
  it function properly.
 
 How do I type chinese characters? I can't. So I can't write mail to  
 someone whose email address contains non-ASCII characters if I don't  
 already have the address in electronic form (e.g. within a webpage).

Right.

And, if a mailto URL is within a webpage with a chinese character
anchor, it does not matter whether a mail address in the URL
consists of pure ASCII characters or not.

 It's worth nothing that my computer could handle the address if I can't.

You properly understand that the current ASCII DNS is already
fully internationalized.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread vint cerf

Mr. Ohta has put his finger on a key point: ability of all
parties to generate email addresses, web page URLs and so on.
Even if we introduce extended character sets, it seems vital
that there be some form of domain name that can be rendered
(and entered) as simple IA4 characters to assure continued
interworking at the most basic levels. This suggests that
there is need for some correspondence between an IA4 Domain
Name and any extended character set counterpart.

Vint

At 07:32 PM 12/6/2000 +0859, you wrote:
And, if a mailto URL is within a webpage with a chinese character
anchor, it does not matter whether a mail address in the URL
consists of pure ASCII characters or not.

 It's worth nothing that my computer could handle the address if I can't.

You properly understand that the current ASCII DNS is already
fully internationalized.




RE: Will Language Wars Balkanize the Web?

2000-12-06 Thread Hongwei

I can't agree more.

-Original Message-
From: John C Klensin [mailto:[EMAIL PROTECTED]]
Sent: 06 December 2000 16:46
To: vint cerf
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: Will Language Wars Balkanize the Web?


(Can we please move this discussion to the IDN list, where it
belongs?)

--On Wednesday, 06 December, 2000 08:19 -0500 vint cerf
[EMAIL PROTECTED] wrote:

 Mr. Ohta has put his finger on a key point: ability of all
 parties to generate email addresses, web page URLs and so on.
 Even if we introduce extended character sets, it seems vital
 that there be some form of domain name that can be rendered
 (and entered) as simple IA4 characters to assure continued
 interworking at the most basic levels. This suggests that
 there is need for some correspondence between an IA4 Domain
 Name and any extended character set counterpart.

Vint,

I think I agree with the principle.  However, there are several
different models with which the "correspondence" can be
implemented.  The difference among them is quite important
technically --implementations would need to occur in different
places and with different implications, deployment times, and
side effects--  and perhaps as important philosophically.  E.g.,
let me try to identify some of them in extreme form to help
identify the differences:

(i) The names in the DNS are "protocol elements".  They should
be expressed in a minimal subset of ASCII so that they can be
rendered and typed on almost all of the world's equipment (the
assumption that, e.g., all Chinese or Arabic keyboards and
display devices in the medium to long term will contain Roman
characters seems a little dubious).  There is no requirement
that they be mnemonic in any language: in principle, a string
containing characters selected at random would do as well as the
name of a company, person, or product.

This model gives rise to directory and keyword systems (most of
them outside the DNS) that contain the names that people use.
While the registration and name-conflict problems are
non-trivial, names in multiple languages and character codings
can easily map onto a single DNS identifier.  On the other hand,
binding a national-language name to an ASCII name would need to
be done either by parallel registrations or by matching on
keywords (and the latter might not yield unambiguous and
accurate results).

(ii) Entries in the DNS are always coded.  After all, "ASCII" is
just a code mapping between a human-visible character set and a
machine (or wire) representation.  It is the job of an
application to get from "characters" to "codes" and back, and to
recognize coding systems and apply the correct decodings.
And software that is old or broken will simply display a
different rendering of the coded form (whether that is a
"hexification" such as Base64 or some other system).  

This model gives rise to the "ACE all the way up" models, in
which non-ASCII names are placed in the DNS using some tagging
system, but the "ASCII representation" of a name that, in the
original, uses non-Roman characters, may be quite ugly and bear
no connection with the name as it would be rendered using the
original characters other than an algorithmic one.   It also
gives rise to some of the UTF-8 models, on the assumption that
applications that can't handle the full IS 10646 character set
can do something intelligent.

(iii) Regardless of how the names in the DNS are coded, it is
important to have analogies to "two sided business cards".  Such
systems assume that any name rendered in a non-Roman character
set should have an analogue in Roman characters.  And those
analogues are expected to be bound to the original form by
transliteration or translation -- they aren't just random,
algorithmically matching, strings.

While there need to be facilities for the non-Roman (even
non-ASCII) characters in either the DNS or a directory,
establishing the "ASCII names" is, of necessity, a registration
issue rather than an algorithmic issue.  We don't know how to do
the "translation" (or, in the general case, even
transliteration) algorithmically.   To give one example, despite
the "Han unification" of IS 10646, the characters on a Japanese
business card for you would almost certainly be different from
those on a Chinese business card for you.  And, because of the
registration issue, there is no plausible way to impose a
requirement that every host (or other DNS entry) have a name in
ASCII if it has a name in some other script: people and hosts
not visible outside their own countries may not care enough to
go to the trouble.

These models are not mutually exclusive.  But they are
definitely different perspectives.

It is also worth noting that, as a matter of perspective, the
dominance of subsets of ASCII in these debates has some
important technical advantages (e.g., the code set can be m

Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread Masataka Ohta

John;

 (Can we please move this discussion to the IDN list, where it
 belongs?)

The point is that the IDN WG is purposeless and is wrong to exist. Of
course, it is a waste of time to discuss it on the IDN list. So, the
only reasonable reaction is to ignore it (I dropped the improper CC:).

The only necessary discussion on domain names, IF ANY, is about
localization issues, for which there is no specific IETF WG.

 (iii) Regardless of how the names in the DNS are coded, it is
 important to have analogies to "two sided business cards".

A typical business card of a Japanese person has Chinese characters.

When we internationalize it, we use the other side to put a Latin
character version.

As we already have a fully internationalized DNS with Latin
characters, Chinese characters in the DNS are localization working
against internationalization.

 And, because of the
 registration issue, there is no plausible way to impose a
 requirement that every host (or other DNS entry) have a name in
 ASCII if it has a name in some other script: people and hosts
 not visible outside their own countries may not care enough to
 go to the trouble.

Those are local issues.

If people want local names let them have them under local domains,
with all the local conventions on encoding and everything.

The administrator of the local domains may or may not force people to
have additional internationalized domain names.

Note that local, here, means culturally (not necessarily geographically)
local, so that ccTLDs may or may not be the local domains.

But, it can be said that gTLDs are not a proper place to put local
names.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-06 Thread Vernon Schryver

 From: Masataka Ohta [EMAIL PROTECTED]

 ...
  (Can we please move this discussion to the IDN list, where it
  belongs?)

 The point is that the IDN WG is purposeless and is wrong to exist. Of
 course, it is a waste of time to discuss it on the IDN list

Masataka Ohta is raising a point of order, and from what I've seen of
other "internationalization" efforts, it is probably more valid than
not.  That the IETF's effort nominally involves "internationalization"
instead of "localization" is bad sign.

Since I first encountered "internationalization" hassles in the late
1970's in making an ASCII+EBCDIC system behave tolerably for people
typing and reading Arabic and Hebrew text, I've found that
"internationalization" is both technically hard and incredibly Politically
Correct.  Some people like to hoist standardized flags that today bear
"Respect for Diversity" and start marching over cliffs--no, that's wrong.
In Politically Correct issues, the standards bearers tell everyone else
to march over the cliff while they stand to attention nearby.

Once an "internationalization" organization gets started, it *never*
stops, no matter how many of the original participants get wise
and quit, what obviously false premise is required to justify the
latest conclusion, nor what damage has already been done (not to
mention contemplated) in the product, standard, protocol, or whatever
justifies the existence of the internationalization organization.
"Is the new version equally and completley useless for both domestic
and overseas users?--Great, let's fix the next one."

It took me about 10 years and more than one "internationalization"
organization to reach that politically incorrect conclusion.


 ...
 If people want local names let them have them under local domains,
 with all the local conventions on encoding and everything.

 The administrator of the local domains may or may not force people to
 have additional internationalized domain names.

 Note that local, here, means culturally (not necessarily geographically)
 local, so that ccTLDs may or may not be the local domains.

 But, it can be said that gTLDs are not a proper place to put local
 names.

The same thinking that says that MIME Version headers make sense in
general IETF list mail also says that localized alphabets and glyphs must
be used in absolutely all contexts, including those that everyone must
use and so would expect to be limited to the lowest common denominator.
When confronted with the fact that ANSI X3.4 (ASCII) is a provincial U.S.
variant of an international standard, otherwise rational people flinch
and claim that sending anything but 7-bit ASCII to major IETF lists is
not merely an unthinking waste of bandwidth but must be supported and
encouraged.  They justify such nonsense with talk like:

]diversity of list
] contributors' networking interests and experience (culture), which include
] people who happen to find it cost-effective to use such things as
] formatting and unusual character sets in their email. MIME is as much a
] part of the Internet culture as any standard 

(apologies to the author of that private message)

It is a mystery to me why otherwise reasonable people who would never
dream of imposing their own idiosyncrasies on everyone else demand that
others not only be allowed but encouraged to do so.

In other words, people have trouble understanding that
"internationalization" necessarily means restricting to the lowest
common international denominator instead of the impossible goal of
simultaneously supporting absolutely all possible languages and glyphs.


Vernon Schryver[EMAIL PROTECTED]




Re: Will Language Wars Balkanize the Web? P.S. Eudora/PalmOS

2000-12-06 Thread James P. Salsman

Masataka Ohta and Vernon Schryver make excellent points in favor 
of the domain name status quo.  I agree that IDN should be frozen 
for at least a few years to see what local domain admins and 
application vendors tend to do, especially since the pieces of 
the likely solutions (such as the competing UTF-8 encodings) are 
still so new and somewhat under development.

I don't know why ICANN would want to bring such a heavy burden
upon themselves in an area of such flux so soon, when they have 
so much else that they have already committed to do.

This thread reminded me of these news items, only two days apart:

http://abcnews.go.com/sections/travel/DailyNews/FrenchintheSkies000404.html

http://abcnews.go.com/sections/travel/DailyNews/BacktoFrenchinSkies000406.html

Cheers,
James

P.S.  By the way, on my usual topic of wireless asynchronous voice 
messaging, here is a news article in which Qualcomm founder and chief 
Irwin Jacobs asserts that "voice-enabled capabilities" "could prove 
popular" on third-generation mobile phones:

  http://biz.yahoo.com/rf/001206/hkg15073_2.html

I suppose Irwin Jacobs is the person to ask for MIME audio attachment 
record and play in Eudora email on the PalmOS.  Please ask in person 
if you see him in San Diego!




RE: Will Language Wars Balkanize the Web?

2000-12-05 Thread Martin J. Duerst

At 00/12/04 10:42 -0800, Christian Huitema wrote:
So, at a minimum, we need an IETF
specification on how to detect that a domain name part is using a non ascii
encoding, so that DNS servers don't get lost.

Why not just use UTF-8? It is an encoding of the UCS (aka
Unicode/ISO 10646), the encoding is fully compatible with
ASCII (all 7-bit bytes are ASCII and only ASCII), and it
is IETF policy (RFC 2277).

Regards,   Martin.
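
A minimal sketch (Python 3, not part of the original message) of the
compatibility property cited above: an ASCII-only name encodes to the
identical bytes under UTF-8, and any non-ASCII character is represented
only by bytes of 0x80 or above, so ASCII-only software never mistakes it
for ASCII text. The domain names used here are hypothetical.

    # ASCII round-trips unchanged under UTF-8 (7-bit bytes are ASCII and only ASCII).
    ascii_name = "example.com"
    assert ascii_name.encode("utf-8") == ascii_name.encode("ascii")

    # In a non-ASCII label only the non-ASCII character gains bytes,
    # and those bytes are all >= 0x80.
    name = "snömos.se"                                 # hypothetical label
    encoded = name.encode("utf-8")                     # b'sn\xc3\xb6mos.se'
    ascii_part = bytes(b for b in encoded if b < 0x80)
    assert ascii_part.decode("ascii") == "snmos.se"    # ASCII chars pass through untouched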





Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Martin J. Duerst

At 00/12/04 19:58 -0500, Eric Brunner wrote:
  I guess one of the first questions should be;  "Is some partitioning of
  the Internet community such a bad thing?"...

If the "partition" intended for discussion is "@sign vs !path" addressing
conventions, Eric Allman and Peter Honeyman have left a discussion archive
on the subject.

Any pointers?

Arguably the universalist thesis understated the drawbacks
of anyone having the capability of addressing everyone anywhere. Clueless
users is only one possible policy model -- a point made by Peter then, and
equally valid now.

Personally I'm underwhelmed by the universalism advocated by the members
of the UNICODE Consortium, a single encoding scheme of necessity comes to
peripheral markets late in their adoption of computerized writing systems,
and their integration into a rationalized global system is not obviously a
boon to their pre-integration service models.

Unicode came late to everybody's adoption of computerization of writing.
Most probably the delay is much longer for central markets than for
peripheral markets, but that would have to be checked.

Also, one main factor in the delay in many cases is the amount of time
it takes for the specific 'market' to agree on a single encoding scheme,
or encoding table, locally. In some cases (e.g. Korean), this is due to
the wide range of choices that the script offers for encoding. In other
cases, this is due to the fact that it takes some time (up to one
generation) for all the people who have proposed and implemented
different encodings not only to realize that everybody would benefit
from a single encoding, but also to accept that to a large extent,
which single encoding is chosen is far less important than that
a single one is chosen.


On the up-side, large user bases need not adapt to extraneous requirements
for participating in the "Internet community", and Universalist Credos may
fail in the markets (plural intended).

I think there is a difference between making it technically possible
for everybody to participate in whatever community they want, and
forcing anybody to do so. Internet technology has shown that it's
quite usable in local circumstances (the best example being intranets).

Regards,   Martin.




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Martin J. Duerst

At 00/12/04 08:15 -0500, Dave Crocker wrote:

Thank you.  I was hoping someone would point out the support for parallel 
operation so we could go further down that path.  As you note, it seems to 
be the closest to providing local/global support already.

That means postal gives us:

1. Global support for a common "character set"

2. Global support for a carefully mixed character set -- though really it 
is just a partitioning between the global field and the local field

3. Local support for a local character set.

(the support goes beyond character set, but let's leave it at that if 
that's ok.)

An immediate problem with comparing to postal is that it somewhat 
correlates with the path a letter will take, so that the incremental 
interpretation can be done by groups with different language skill-sets.

Really big post offices have special places to handle things such
as incomplete addresses. Nothing guaranteed, but if you are lucky,
you may even successfully send a letter from an arbitrary place to
anywhere in the world using local addressing, at least if you don't
forget the country name in the local script.


The DNS does not have that flexibility and the domain name interpretation 
is not part of the transfer sequence of the data.

Yes, there are quite some differences. The advantage we have is
that as soon as the characters are somehow in the computer,
everything else is mechanical. This means there is no need
for a global field; if somebody is able to type in the address,
that's it, the machine does the rest.


Schemes that put an ACE-like field into a .com might be considered to be 
like #2, above, but really they are not.  The whole string is still global.

ACE is (maybe) for machines. It's not primarily intended for humans.
We may have ACE all the way (including TLD). It might be usable as a
poor man's ASCII equivalent, but I strongly doubt that anybody will
want to have it on the Latin side of their name card.


Frankly this leaves me viewing the postal example as pretty unhelpful for 
finding a solution to the DNS requirement.

Well, the postal example shows how Latin and other scripts can
both be used to address something. The mixed case is not too
important for us, as discussed above.

In the postal example, conversion from one notation to the other
is a complex process (in particular for Japanese, lookup in context
is absolutely necessary). So I don't expect that something purely
mechanical (e.g. ACE) will do for DNS.


On the other hand, this thread was triggered by Graham's question about 
the negative impact of partitioning.  The postal example would seem to 
show that the effect is not so bad.




Except I would claim that it is not partitioning.  Note that an address 
always has a global representation, in addition to a possibly different 
local one.

It's a kind of partitioning, in that it is not always easy,
for everybody, to use the 'local' address or to convert
from a local to a global one.


Perhaps that can be reconciled as easily as claiming that any 'local' domain 
name must also have a global form? (But, somehow, the word "scaling" gets 
in the way of believing that.)

Scaling would be only by a factor of 2.


Regards,   Martin.




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread vint cerf

however the value of the public Internet is surely in its widespread
accessibility and interoperability.

vint

At 05:10 PM 12/5/2000 +0900, Martin J. Duerst wrote:
I think there is a difference between making it technically possible
for everybody to participate in whatever community they want, and
forcing anybody to do so. Internet technology has shown that it's
quite usable in local circumstances (the best example being intranets).




RE: Will Language Wars Balkanize the Web?

2000-12-05 Thread RJ Atkinson

At 02:53 05/12/00, Martin J. Duerst wrote:
At 00/12/04 10:42 -0800, Christian Huitema wrote:
So, at a minimum, we need an IETF
specification on how to detect that a domain name part is using a non ascii
encoding, so that DNS servers don't get lost.

Why not just use UTF-8? It is an encoding of the UCS (aka
Unicode/ISO 10646), the encoding is fully compatible with
ASCII (all 7-bit bytes are ASCII and only ASCII), and it
is IETF policy (RFC 2277).

All,

Please MOVE this conversation to the IDN WG list,
where it would be in scope.  Btw, this specific question
has been raised and answered several times now on the IDN list.
I encourage folks to read the sundry IDN proposals before
diving in any deeper here.

Thanks,

Ran




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Randy Bush

 Really big post offices have special places to handle things such
 as incomplete addresses. Nothing guaranteed, but if you are lucky,
 you may even successfully send a letter from an arbitrary place to
 anywhere in the world using local addressing, at least if you don't
 forget the country name in the local script.

tagging, eh?




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Eric Brunner

Martin,

I'll send you a copy of the "@sign vs !path" debate from my USENIX papers
archive. See "Pathalias: or The Care and Feeding of Relative Addresses" by
Honeyman and Bellovin, undated, at http://www.uucp.org/papers/pathalias.pdf.

Speculations on the general utility and availability of "single" encoding
schemes or some approximation of limited ambiguity code-set mapping(s)
should not displace actual data. The claim that iso10646 is "good" is not
improved by non-reference to the costs and benefits of ASCII-colliding
encodings (EBCDIC, SJIS, etc.), just as the "interoperability" claim is
not improved by non-reference to the operational deployment of serviceable
encoding.

Ignoring the daft peculiarities of particular encodings (and ANSI C) such
as NULLs in strings (or file names), what I learned from owning the i18n
problem at Sun was that a program of code-set independence had time-to-market,
sustaining engineering, and ease of implementation arguments over a program
of opportunistic code-set dependence (the industry standard practice), and
as a matter of convenience, that the XPG/3 locale model made a utf8 locale a
minor cost item, and an internal convenience mechanism. It was a compelling
case whose hardest technical issue was dynamic character width determination
in the bottom-half of the tty subsystem.

I mention this to contrast it with substitution of UTF8 (or any fixed-width
multi-octet encoding scheme) dependence for ASCII dependence, or the common
form of an addition of an "alternate code path" which affords run-time
selection of one of two code-set dependent processing mechanisms.

From my perspective, the IETF has preferred the second form of solution to
the problem since the appearance of rfc2130. See also the following rfcs:
0373, 1345, 1468, 1489, 1502, 1555, 1557, 1815, 1842, 1922,
1947, 2237, and 2319.

As I pointed out to you over lunch Thursday at the W3C AC meeting, the i18n
problem is not simplified by the constraint which requires reference to
iso639, or iso3166. While few ARPAnauts have an evident interest in the
problem of Euro-American Americanist hobbyists getting the fundamentals of
Cherokee wrong (or care that there are three Cherokee polities), in an ISO
normative reference (iso10646), on other lists (ICANN cluttered) Americans
of sundry "liberties" persuasions are quite worked up that Euro-American
Sinology hobbyists do not, or may not, have precedence over Chinese
governmental and cultural institutions on the operational deployment of
Chinese language elements in the DNS (CNNIC vs Verisign).

A related question is whether the i18n problem is simplified by a constraint
which requires reference to the IAB Technical Comment on the Unique DNS Root,
a constraint which adds, without reflection, the constraints of iso3166 to
the dns-i18n problem set. Again, from my perspective, several sets of critics
of the IANA transition(s), and its reluctant proponents, have overloaded the
dns-i18n problem set as either an escape mechanism from uniqueness of the
DNS root, or as a problem which cannot be solved except by preservation of
the same property (uniqueness).

Neither party appears to be motivated by the interests of users of ASCII
colliding or pre-iso10646 (et alia) encodings, or users without practicable
means to use their preferred writing (or signing) systems.

Assuming a heterogeneity of end-systems, each possibly with a heterogeneous
set of character encoded applications with some cut-buffer mediation
mechanism, e.g., a (encoding-neutral or encoding-preferential) windowing
system for transparent, or converted reads and write operations between
end-system resident applications, and a DNS resolver library with access to
DNS service, and no additional constraints (these are enough, thanks!),
is UTF-8 _the_ compelling answer?

The attractions of Universalism still appear to be compelling, only if some
non-technical, or ancillary service model is controlling. Unfortunately,
the utility of Particularism is temporarily hijacked anywhere near the DNS
by partisans of one convention or its converse.

If next-hop has a case for forwarding, then it is surprising that the case
can't be applied to forwarding, except for opaque datagrams.

Cheers,
Eric

P.S. I forgot to work in NATs and VPNs. Sigh.




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Patrik Fältström

At 18.05 +0900 00-12-05, Martin J. Duerst wrote:
ACE is (maybe) for machines. It's not primarily intended for humans.
We may have ACE all the way (including TLD). It might be usable as a
poor man's ASCII equivalent, but I strongly doubt that anybody will
want to have it on the Latin side of their name card.

I would, because I know that people in many parts of the world don't 
know how to enter "snömos" on their keyboard, and if I register the 
domain "snömos.se", I really want people to be able to get to

   http://www.snömos.se

so, I think it is perfectly alright to have

   http://www.bq--abzw55tnn5zq.se

on my business card (as well).

paf


-- 
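
The "bq--" string on the card above comes from one of the pre-standard ACE
proposals, so it cannot be reproduced exactly here. The idea, deriving a
pure-ASCII alias mechanically from the non-ASCII name and converting back,
can be sketched with the IDNA/punycode form standardized later, using
Python 3's built-in codec (an illustration, not part of the original
message; the resulting "xn--" string differs from the "bq--" form shown).

    unicode_form = "www.snömos.se"
    ace_form = unicode_form.encode("idna")           # per-label ToASCII, "xn--" prefix
    print(ace_form)                                  # exact string depends on the algorithm
    assert ace_form.decode("idna") == unicode_form   # the mapping is reversible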




Re: Will Language Wars Balkanize the Web?

2000-12-05 Thread Masataka Ohta

Ran;

 At 02:53 05/12/00, Martin J. Duerst wrote:
 At 00/12/04 10:42 -0800, Christian Huitema wrote:
 So, at a minimum, we need an IETF
 specification on how to detect that a domain name part is using a non ascii
 encoding, so that DNS servers don't get lost.
 
 Why not just use UTF-8? It is an encoding of the UCS (aka
 Unicode/ISO 10646), the encoding is fully compatible with
 ASCII (all 7-bit bytes are ASCII and only ASCII), and it
 is IETF policy (RFC 2277).
 
 All,
 
 Please MOVE this conversation to the IDN WG list,
 where it would be in scope. Btw, this specific question
 has been raised and answered several times now on the IDN list.
 I encourage folks to read the sundry IDN proposals before
 diving in any deeper here.

IDN is the perfect place for repeated silly conversation like above.

But it is not the place to discuss localized domain names, which
have nothing to do with internationalization.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Martin J. Duerst

At 00/12/03 08:03 +, Graham Klyne wrote:
There's a news story at:

   http://www.acm.org/technews/articles/2000-2/1201f.html#item10

under the heading "Will Language Wars Balkanize the Web?"

Leaving aside the issues of competing registries,

Sorry, but I think that's the main topic of the article (as far
as I can deduce from the abstract), and it is also the main
threat to create balkanization. The problem currently is not
that Chinese domain names may create a disconnect between
the "Chinese Internet" and some other part of the Internet,
but that there are various proposals and actors that are
working on Chinese domain names, and that all of them act
prematurely (i.e. before there is an IETF spec) and with
side interests that affect things negatively.


touched upon in that article, I had been wondering with the formation of 
IDN WG how I18N would affect cross-character-type-boundary Internet activities.

I guess one of the first questions should be;  "Is some partitioning of 
the Internet community such a bad thing?".  Why should it matter if, say, 
Chinese-based domains aimed at Chinese audiences are not meaningfully 
accessible to non-Chinese Internet users?

Reasonable question indeed. If the content is Chinese, does it hurt if the
address is also Chinese? There are cases where it indeed hurts (such as when
you have fonts to display Chinese on your system, but nothing to input
Chinese, as may be the case if you work off an English OS of some kind).
However, in general and for the majority of actual users (i.e. for
the Chinese users reading Chinese web pages,...), having Chinese
domain names is actually a big advantage. They are easier to
memorize, easier to guess, easier to identify with, and so on.


At a purely technological level, the priority ascribed to the end-to-end 
architecture of the Internet has underpinned and presumed 
non-discriminatory any-to-any communication.  I wonder if this is a 
reasonable expectation at the social level of Internet use.

At the *linguistic* level, there are certain rather hard boundaries
based on the difficulty of learning foreign languages and on the
slow advances of machine translation. At the social level, boundaries
should be kept as low as possible.

Regards,   Martin.




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Martin J. Duerst

At 00/12/03 13:57 -0500, Dave Crocker wrote:
Would it be such a bad thing to be unable to postal mail a letter or 
package to anywhere in the world?

Of course it would be very bad. But it is usual now to send mail
e.g. from Japan to Japan with an address without any Latin letters.
It is also possible to send mail e.g. from the US or Europe to e.g.
Japan, with all but the country name in ideographs.

So the postal system is already now much closer to multilingual
domain names than to ASCII-only domain names.

It is also possible, as far as I understand, to send mail
with an address only written in Latin letters, to any country
in the world. The multilingual domain name solution should of
course provide a way (at least one way) to do this.

Please also note that Japanese name cards usually have two sides,
one in Japanese and one in Latin. Now, the email addresses on
both sides are the same, but in the future, you would just
use the one on the Latin side if you cannot type Japanese.


Regards,   Martin.




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Dave Crocker


Thank you.  I was hoping someone would point out the support for parallel 
operation so we could go further down that path.  As you note, it seems to 
be the closest to providing local/global support already.

That means postal gives us:

1. Global support for a common "character set"

2. Global support for a carefully mixed character set -- though really it 
is just a partitioning between the global field and the local field

3. Local support for a local character set.

(the support goes beyond character set, but let's leave it at that if 
that's ok.)

An immediate problem with comparing to postal is that it somewhat 
correlates with the path a letter will take, so that the incremental 
interpretation can be done by groups with different language 
skill-sets.  The DNS does not have that flexibility and the domain name 
interpretation is not part of the transfer sequence of the data.

Schemes that put an ACE-like field into a .com might be considered to be 
like #2, above, but really they are not.  The whole string is still global.

Frankly this leaves me viewing the postal example as pretty unhelpful for 
finding a solution to the DNS requirement.

On the other hand, this thread was triggered by Graham's question about the 
negative impact of partitioning.  The postal example would seem to show 
that the effect is not so bad.

Except I would claim that it is not partitioning.  Note that an address 
always has a global representation, in addition to a possibly different 
local one.

Perhaps that can be reconciled as easily as claiming that any 'local' domain 
name must also have a global form? (But, somehow, the word "scaling" gets 
in the way of believing that.)

d/

At 05:20 PM 12/4/00 +0900, Martin J. Duerst wrote:
At 00/12/03 13:57 -0500, Dave Crocker wrote:
Would it be such a bad thing to be unable to postal mail a letter or 
package to anywhere in the world?

Of course it would be very bad. But it is usual now to send mail
e.g. from Japan to Japan with an address without any Latin letters.
It is also possible to send mail e.g. from the US or Europe to e.g.
Japan, with all but the country name in ideographs.

So the postal system is already now much closer to multilingual
domain names than to ASCII-only domain names.

It is also possible, as far as I understand, to send mail
with an address only written in Latin letters, to any country
in the world. The multilingual domain name solution should of
course provide a way (at least one way) to do this.

=-=-=-=-=
Dave Crocker  [EMAIL PROTECTED]
Brandenburg Consulting  www.brandenburg.com
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Masataka Ohta

Dave;

 Thank you.  I was hoping someone would point out the support for parallel 
 operation so we could go further down that path.  As you note, it seems to 
 be the closest to providing local/global support already.

Silly comparison.

An efficient postal system works with numbers, the so-called zip code.

A postal address with various characters needs human intervention for
complex matching and is similar not to DNS but to search engines.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Valdis . Kletnieks

On Sun, 03 Dec 2000 16:00:53 PST, lists [EMAIL PROTECTED]  said:
 "I'm sorry, I'm not going to be able to figure out how to type that email
 address on my keyboard, could you please send me a message, and I'll just hit
 reply".

Wasn't there a Dilbert cartoon regarding sending a page to a pager number
containing a caret? ;)
-- 
Valdis Kletnieks
Operating Systems Analyst
Virginia Tech



 PGP signature


Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Valdis . Kletnieks

On Sun, 03 Dec 2000 13:17:45 EST, vint cerf [EMAIL PROTECTED]  said:
 to incorporate and refer to domain names. The IA4 alphabet includes essentially
 just the letters A-Z, numbers 0-9 and the "-" (dash). This is the limit of what
 is allowed in domain names today. 

The sad part is, of course, that RFC1035, section 3.1 specifically says
that any octet value is legal.

But I guess we're stuck with the IA4 charset ;(


-- 
Valdis Kletnieks
Operating Systems Analyst
Virginia Tech


 PGP signature


Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Robert G. Ferrell

Wasn't there a Dilbert cartoon regarding sending a page to a pager number
containing a caret? ;)

It was a tilde.

;-)

RGF

Robert G. Ferrell, CISSP
Information Systems Security Officer
National Business Center
U. S. Dept. of the Interior
[EMAIL PROTECTED]

 Who goeth without humor goeth unarmed.





RE: Will Language Wars Balkanize the Web?

2000-12-04 Thread Christian Huitema

 On Sun, 03 Dec 2000 13:17:45 EST, vint cerf [EMAIL PROTECTED]  said:
  to incorporate and refer to domain names. The IA4 alphabet 
 includes essentially
  just the letters A-Z, numbers 0-9 and the "-" (dash). This 
 is the limit of what
  is allowed in domain names today. 
 
 The sad part is, of course, that RFC1035, section 3.1 
 specifically says
 that any octet value is legal.

The restrictions that Vint mentions are actually restrictions on the domain
name part of email addresses, as specified in RFC-821. The DNS system itself
does not have such restrictions; this allows for example RFC 2782 to specify
the use of the "illegal" character _ (underline) in some domain name parts.
The main restriction in the DNS itself is the comparison rule embedded in
the system, that says that domain names are case independent. Case
comparison is indeed specific to the alphabet code, and in fact is
oftentimes language dependent. The matter is already muddy for European
languages. In a case independent comparison in French, e-acute matches the
accentless e; in German, u-umlaut could match the digraph "ue"; DNS servers
don't do such matches, but at least they do the binary comparison right when
an 8-bit alphabet is a superset of ASCII. But the matter indeed gets more
complex when the characters are encoded on 16 bits, when either the top or
the bottom could be misinterpreted as a lower or upper case ascii letter,
resulting in incorrect matches. So, at a minimum, we need an IETF
specification on how to detect that a domain name part is using a non ascii
encoding, so that DNS servers don't get lost.

-- Christian Huitema
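
A sketch (Python 3, not part of the original message) of the byte-level
hazard described above: in a 16-bit encoding, individual octets of a
non-ASCII character can land in the A-Z range, so a server doing naive
ASCII case-folding silently turns the label into a different name, while
the octets UTF-8 uses for non-ASCII characters are all 0x80 or above and
pass through a case-fold untouched. The label is hypothetical.

    def ascii_casefold(raw: bytes) -> bytes:
        # Fold A-Z to a-z octet by octet, as a legacy DNS server might.
        return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in raw)

    label = "中"                               # one CJK character, U+4E2D

    ucs2 = label.encode("utf-16-be")           # b'N-' : octets 0x4E 0x2D look like ASCII
    assert ascii_casefold(ucs2) != ucs2        # case-folding corrupts the name

    utf8 = label.encode("utf-8")               # b'\xe4\xb8\xad' : every byte >= 0x80
    assert ascii_casefold(utf8) == utf8        # case-folding leaves it intact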




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Keith Moore

 So, at a minimum, we need an IETF
 specification on how to detect that a domain name part is using a non ascii
 encoding, so that DNS servers don't get lost.

We need a great deal more than that.

The real impact of internationalizing DNS names isn't with the DNS 
protocol or software itself (you can probably do it without any changes 
to these), it is the applications that make assumptions about character 
encodings used in DNS names and/or place their own limitations on the 
allowable characters in DNS names.  

Keith
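
A sketch (Python 3, hypothetical code, not something cited in the message)
of where the real constraint lives: an application-side hostname check that
hard-codes the letters-digits-hyphen assumption and so rejects any
internationalized name, whatever octets the DNS protocol itself would carry.

    import re

    # The classic LDH rule: ASCII letters, digits, and interior hyphens only.
    LDH_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")

    def application_accepts(hostname: str) -> bool:
        # Many mailers, browsers, and resolver wrappers applied a check like this.
        return all(LDH_LABEL.match(label) for label in hostname.split("."))

    assert application_accepts("example.com")
    assert not application_accepts("snömos.se")   # DNS could carry it; the app refuses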




Re: Will Language Wars Balkanize the Web?

2000-12-04 Thread Eric Brunner

 I guess one of the first questions should be;  "Is some partitioning of the 
 Internet community such a bad thing?"...

If the "partition" intended for discussion is "@sign vs !path" addressing
conventions, Eric Allman and Peter Honeyman have left a discussion archive
on the subject. Arguably the universalist thesis understated the drawbacks
of anyone having the capability of addressing everyone anywhere. Clueless
users is only one possible policy model -- a point made by Peter then, and
equally valid now.

Personally I'm underwhelmed by the universalism advocated by the members
of the UNICODE Consortium, a single encoding scheme of necessity comes to
peripheral markets late in their adoption of computerized writing systems,
and their integration into a rationalized global system is not obviously a
boon to their pre-integration service models.

 PS:  I think it is without doubt that it is a Good Thing that we make 
 efforts to internationalize protocols ...

Even less satisfactory is the practice of generalizing ASCII (nee BCD) to
encodings with more than 256 code points, via this universalist scheme and
no other. To advance from ASCII to ASCII-plus-UTF8 could be just as well
characterized as SJIS/GB/Big5/... (and their uses) deprecated.

   ... my comments/questions are an 
 attempt to explore how far this process can reasonably go.

The i18n problem isn't trivial, and isn't advanced by problematic essays,
good intentions, or American (actual and honorary) indulgences.

On the up-side, large user bases need not adapt to extraneous requirements
for participating in the "Internet community", and Universalist Credos may
fail in the markets (plural intended).

As for poking the ICANN mess in the eye with a sharpened brush on the IETF
list prior to a meeting, it is clumsy sleight-of-hand and a poor substitute
for work on writing system support.  See also the W3C WAI for information
encoding and presentation systems which are not "writing".

Kitakitamatsinopowaw,
Eric




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Masataka Ohta

Graham;

 Leaving aside the issues of competing registries, touched upon in that 
 article, I had been wondering with the formation of IDN WG how I18N would 
 affect cross-character-type-boundary Internet activities.

Nothing.

Cross-character-type-boundary is a pure localization issue
and has nothing to do with people wrongly working on I18N.

 PS:  I think it is without doubt that it is a Good Thing that we make 
 efforts to internationalize protocols;

If only you understand what "internationalize protocols" means.

ASCII (Latin, numeric, and hyphen) characters are the only characters
internationally recognizable by so many people.

Masataka Ohta




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread RJ Atkinson

At 03:03 03/12/00, Graham Klyne wrote:
I guess one of the first questions should be;  "Is some partitioning of the Internet 
community such a bad thing?"

A partitioning based on nationality, which is of course
different than language group, would be harmful.  Lack of
interoperability of standard protocols would be bad, for
whatever reason, including incompatible localisations.  Lack
of standards support for internationalisation/multi-lingual
computing, as different from localisation, would also be bad.

  Why should it matter if, say, Chinese-based domains aimed 
at Chinese audiences are not meaningfully accessible to 
non-Chinese Internet users?  

What about people who can read and perhaps also write
in Chinese characters but who are not Chinese (either ROC
on Taiwan or PRC on the mainland) nationals ?  Consider
not only folks in Singapore or SE Asia generally, but also 
Chinese-capable folks in other places (e.g. North America, 
Europe).  [NB: I'm deliberately ignoring the issues with 
Traditional vs Simplified characters just now, though that
is also part of the internationalisation equation].  

I regularly read my news from British or Hong Kong
or other countries' web sites.  Living in North America,
I'm certainly not the target audience for the HK Standard
or South China Morning Post.  However, I do read those 
newspapers online.  Less regularly, but occasionally,
I do read Chinese web sites (in Chinese) or Japanese web
sites (reading the Kanji portion only).  I am most assuredly 
NOT the target audience for any of these web sites.

On a daily basis, I receive mail with Chinese language
contents, though a surprising amount of that turns out to
be unsolicited bulk email in my own case.  I receive a modest
amount of German or Vietnamese email.  So multi-lingual protocol
capabilities are quite important to me.

So for all those reasons, it does in fact matter
a great deal.

At a purely technological level, the priority ascribed to the end-to-end architecture 
of the Internet has underpinned and presumed non-discriminatory any-to-any 
communication.  I wonder if this is a reasonable expectation at the social level of 
Internet use.

I do think so.

PS:  I think it is without doubt that it is a Good Thing that we make efforts to 
internationalize protocols;  my comments/questions are an attempt to explore how far 
this process can reasonably go.

I don't want to try to predict the future, so I won't.  
I can say that today, we are NOT anywhere close to a reasonable 
end point or stopping point for internationalisation of IETF 
standards-track protocols.  In particular, we haven't resolved
the basic internationalisation issues for a number of core 
infrastructure protocols (e.g. DNS).

Regards,

Ran
[EMAIL PROTECTED]




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Dave Crocker

At 08:03 AM 12/3/00 +, Graham Klyne wrote:
I guess one of the first questions should be: "Is some partitioning of 
the Internet community such a bad thing?".

Would it be such a bad thing to be unable to make a phone call to anywhere 
in the world?

Would it be such a bad thing to be unable to postal mail a letter or 
package to anywhere in the world?

d/

ps.  strictly rhetorical questions, as I hope is obvious.


=-=-=-=-=
Dave Crocker  [EMAIL PROTECTED]
Brandenburg Consulting  www.brandenburg.com
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread vint cerf

In my opinion, it is vital to craft the Internet's evolution so as to maintain
full connectivity and interworking among all its parts. I do not see
"balkanization" as a good thing at all. I believe there are sound technical
means to achieve the objective of incorporating character sets associated with
non-Roman languages, but that critics need to understand more fully just how
important the limitations of the current character set for domain names have
been in maintaining interworking, and also the ability of so many applications
to incorporate and refer to domain names. The permitted alphabet (a subset of
IA5, i.e. essentially ASCII) includes just the letters A-Z, the digits 0-9,
and the "-" (hyphen). This is the limit of what is allowed in domain names
today.
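
A minimal sketch of that letters-digits-hyphen rule, assuming the RFC 1035
label limits (at most 63 octets per label, no leading or trailing hyphen);
the function name and sample strings here are illustrative only:

    import re

    # Classic DNS label alphabet: letters, digits, and hyphen, with no
    # leading or trailing hyphen and at most 63 octets per label.
    LDH_LABEL = re.compile(r'^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$')

    def is_ldh_hostname(name):
        """Rough check that every label of `name` fits the traditional rule."""
        return all(LDH_LABEL.match(label)
                   for label in name.rstrip('.').split('.'))

    print(is_ldh_hostname("example.com"))          # True
    print(is_ldh_hostname("b\u00fccher.example"))  # False: outside A-Z, 0-9, '-'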

Incorporating other character sets without deep technical consideration will
risk the inestimable value of interworking across the Internet. It CAN be done
but there is a great deal of work to make it function properly.

Vint

At 08:03 AM 12/3/2000 +, Graham Klyne wrote:
There's a news story at:

  http://www.acm.org/technews/articles/2000-2/1201f.html#item10

under the heading "Will Language Wars Balkanize the Web?"

Leaving aside the issues of competing registries, touched upon in that article, I had 
been wondering with the formation of IDN WG how I18N would affect 
cross-character-type-boundary Internet activities.

I guess one of the first questions should be: "Is some partitioning of the Internet 
community such a bad thing?".  Why should it matter if, say, Chinese-based domains 
aimed at Chinese audiences are not meaningfully accessible to non-Chinese Internet 
users?  At a purely technological level, the priority ascribed to the end-to-end 
architecture of the Internet has underpinned and presumed non-discriminatory 
any-to-any communication.  I wonder if this is a reasonable expectation at the social 
level of Internet use.

#g

PS:  I think it is without doubt that it is a Good Thing that we make efforts to 
internationalize protocols;  my comments/questions are an attempt to explore how far 
this process can reasonably go.


Graham Klyne
([EMAIL PROTECTED])




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Betsy Brennan

But the Internet is not the postal system nor the phone system. We already
have the postal system and the phone system.  They may be slower, but does
that mean they should be replaced or that the Internet must duplicate what
these systems do? BLB

Dave Crocker wrote:

 At 08:03 AM 12/3/00 +, Graham Klyne wrote:
 I guess one of the first questions should be: "Is some partitioning of
 the Internet community such a bad thing?".

 Would it be such a bad thing to be unable to make a phone call to anywhere
 in the world?

 Would it be such a bad thing to be unable to postal mail a letter or
 package to anywhere in the world?

 d/

 ps.  strictly rhetorical questions, as I hope is obvious.

 =-=-=-=-=
 Dave Crocker  [EMAIL PROTECTED]
 Brandenburg Consulting  www.brandenburg.com
 Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Randy Bush

 But the Internet is not the postal system nor the phone system. We already
 have the postal system and the phone system.  They may be slower, but does
 that mean they should be replaced or that the Internet must duplicate what
 these systems do?

i am sorry, but i can not understand the above.  perhaps you were writing in
californian.  qed.

randy




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Kimon A. Andreou

But isn't the Internet a medium of communication, as are the post and the
telephone?
Therefore, shouldn't it support communication between any two points,
wherever they may be and however they are named?

Kimon


- Original Message -
From: "Betsy Brennan" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, December 03, 2000 15:06
Subject: Re: Will Language Wars Balkanize the Web?


 But the Internet is not the postal system nor the phone system. We already
 have the postal system and the phone system.  They may be slower, but does
 that mean they should be replaced or that the Internet must duplicate what
 these systems do? BLB








Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Kimon A. Andreou


- Original Message -
From: "R . P . Aditya" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, December 03, 2000 16:20
Subject: Re: Will Language Wars Balkanize the Web?


snip
 You can't address a letter to someone in Berkeley, USA in Nagari or
Amharic
 characters and expect it to reach. However you can address a letter to
someone
 in Addis Ababa, Ethiopia in ASCII characters with a poor-phonetic
 approximation and expect it to reach (choice of locales based on
experience).

snip

 Adi

But don't packets get routed using IP addresses (i.e., numbers)?

Kimon






Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Dave Crocker

Kimon gets an A.  Betsy gets an F.

d/

At 03:30 PM 12/3/00 -0500, Kimon A. Andreou wrote:
But isn't the Internet a medium of communication, as are the post and the
telephone?
Therefore, shouldn't it support communication between any two points,
wherever they may be and however they are named?

Kimon
- Original Message -
From: "Betsy Brennan" [EMAIL PROTECTED]
  But the Internet is not the postal system nor the phone system. We already
  have the postal system and the phone system.  T

=-=-=-=-=
Dave Crocker  [EMAIL PROTECTED]
Brandenburg Consulting  www.brandenburg.com
Tel: +1.408.246.8253,  Fax: +1.408.273.6464




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread lists



On Sun, Dec 03, 2000 at 04:56:38PM -0500, Kimon A. Andreou wrote:
 snip
  You can't address a letter to someone in Berkeley, USA in Nagari or
 Amharic
  characters and expect it to reach. However you can address a letter to
 someone
  in Addis Ababa, Ethiopia in ASCII characters with a poor-phonetic
  approximation and expect it to reach (choice of locales based on
 experience).
 
 snip
 
  Adi
 
 But don't packets get routed using IP addresses (i.e., numbers)?

er, wrong layer. Although I'm as good at remembering IP addresses as phone
numbers, you'll have a hard time convincing others to give up DNS.
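
A small sketch of that layering, with an arbitrary example host: the
application resolves the name to numeric addresses up front, and only the
numbers are what routing ever sees.

    import socket

    # Resolve a human-readable name to the numeric addresses that routing
    # actually uses; the name itself never appears in an IP header.
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
            "www.ietf.org", 80, proto=socket.IPPROTO_TCP):
        print(family.name, sockaddr[0])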

"I'm sorry, I'm not going to be able to figure out how to type that email
address on my keyboard, could you please send me a message, and I'll just hit
reply".

Adi




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Randy Bush

 "I'm sorry, I'm not going to be able to figure out how to type that email
 address on my keyboard, could you please send me a message, and I'll just hit
 reply".

if the app-presentation -> internal coding -> dns request mapping is not
one-to-one and reversible on the other end, even this is not sure to work.
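
One way to see the concern, using the "idna" codec that ships with later
Pythons purely as an illustration (it postdates this thread): two different
source strings can map to the same on-the-wire name, so the mapping is
many-to-one and the original presentation form cannot be recovered on the
other end.

    # Two Unicode spellings of the same label: precomposed u-umlaut versus
    # 'u' plus a combining diaeresis.
    precomposed = "b\u00fccher.example"
    decomposed = "bu\u0308cher.example"

    wire1 = precomposed.encode("idna")
    wire2 = decomposed.encode("idna")
    print(wire1)           # b'xn--bcher-kva.example'
    print(wire1 == wire2)  # True: distinct inputs, identical dns request

    # Decoding recovers only one of the original presentation forms.
    print(wire1.decode("idna"))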

randy




Re: Will Language Wars Balkanize the Web?

2000-12-03 Thread Kimon A. Andreou


- Original Message -
From: "lists" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, December 03, 2000 19:00
Subject: Re: Will Language Wars Balkanize the Web?




 "I'm sorry, I'm not going to be able to figure out how to type that email
 address on my keyboard, could you please send me a message, and I'll just
hit
 reply".

 Adi



Good point.

I didn't think about e-mail addresses.


Kimon



