IANA, Unicode, and the multilingual Internet

2006-07-24 Thread Martin Duerst
At 04:05 06/07/23, JFC Morfin wrote:

4.3. IANA registries.    In the case of IANA registries there is no market 
alternative [we saw that in the alt-root case]. The control of an IANA registry 
can therefore be strategic. Until now the IANA had three main areas: numbers, 
names, protocol parameters. The numbers/names are pure Internet issues but 
were considered sensitive enough to be delegated to ICANN. The new area of 
languages

This is not a new area. IANA has managed a language tag registry
since around 1995 (see RFC 1766). But it is important to note that
IANA just registers language tags (or since recently, language
subtags), not languages. 


is not an Internet issue,

RFC 1766, RFC 3066, as well as its approved successors
(draft-ietf-ltru-registry, draft-ietf-ltru-initial and
draft-ietf-ltru-matching) only deal with language tags
on the Internet. It is difficult to understand how language
tagging on the Internet would not be an Internet issue.


is far more important and sensitive than names and numbers,

I wouldn't be co-chair of the LTRU WG if I didn't believe
that language tagging is important, but there are far more
important issues (it's e.g. easy to show that 'charset'
tagging is much more important than language tagging,
because the consequences of failures are much greater).

Also, I agree that language tagging occasionally can be
a sensitive issue (a look at the [EMAIL PROTECTED]
mailing list would definitely give that impression), but
by and large, most language tags are used in practice
without any problems.


and is de facto [this is what I object to] delegated to UNICODE.

It's difficult to object to something that isn't the case.
The language subtag registry is de facto delegated to ISO
(for language codes, country codes, and script codes) because
the IANA registry (except for blunders by ISO that we hope
they won't make anymore) just reflects the relevant ISO
standards. Of the above three kinds of codes, language codes
are obviously the most important (no language tag without a
language code), and script codes are the least important
(most language tags don't need a script code). The Unicode
consortium is designated as the registration authority
for script codes. But this doesn't mean that they can assign
new script codes at will; ISO 15924 (see e.g.
http://www.unicode.org/iso15924/standard/) specifies that
new codes need at least four positive votes from the six
voting members of the Joint Advisory Committee. Only one of
these members is from the Registration Authority (Unicode),
all the others are from other, ISO-related, organizations.
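
To make the three kinds of codes discussed above concrete, here is a
minimal Python sketch that splits a tag into its language, script, and
region subtags. It is an illustration only: the pattern and the helper
name split_tag are assumptions made for this example, and the sketch
deliberately ignores variants, extensions, and private-use subtags, so
it is not the full grammar of draft-ietf-ltru-registry.

```python
import re

# Simplified pattern (an assumption for illustration, not the full grammar):
# a mandatory language code (2-3 letters), an optional script code
# (4 letters), and an optional region code (2 letters or 3 digits).
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)

def split_tag(tag):
    """Return a dict of subtags, or None if the tag does not fit the pattern."""
    match = _TAG.match(tag)
    if match is None:
        return None
    return match.groupdict()

print(split_tag("zh-Hant-TW"))
print(split_tag("en"))
```

The language code is the only mandatory part, which mirrors the point
that there is no language tag without a language code, while most tags
carry no script code at all.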


The IETF is obviously not prepared for this kind of fundamental conflict.

In order to talk about whether the IETF is prepared for a certain
kind of conflict, we first would need to know what kind of conflict
this is. But I can't find any fundamental conflict in the paragraph
above.


5. IETF strategy. There are cases where a possible solution is a significant 
change to the IETF, or even killing the IETF itself. The conflict I am engaged 
in is certainly of that nature. RFC 3935 gives IETF leaders the capacity 
to address such situations, except when the opposed option is defended by 
one/several IETF leaders. We should not consider that such conflicts are 
exceptional: the lack of architectural guidance by the IAB raises several 
other issues. After the Multilingual Internet, what about the multilateral, 
the multitechnology, etc. support?

There are two ways to understand Multilingual Internet above.
One is that the Internet is already for the most part multilingual:
There are Web pages in a large number of languages, emails are sent
around daily in a similar number of languages, and so on, and some
of the remaining issues, mostly in the area of identifiers, are either
on the verge of being fully deployed (IDN) or at least work has started
(internationalized email addresses).

The other way to understand Multilingual Internet is that the
Multilingual Internet is something completely different from what
we have now, much more multilingual for the end user, or whatever.
But while we have heard much buzzwording about that, we haven't
seen any of that in any actual kind, shape, or form, nor have we
actually been told what it's going to look like, or how it's going
to be better than what we have now (see previous paragraph).
So it's vaporware even by the standards of vaporware.

A similar analysis can be made for multilateral and multitechnology
above. Of course the Internet is multilateral, it allows multiple
parties to communicate with each other. Of course it is multitechnology,
on many levels (from the physical and link layer up to the applications
layer).

Regards,    Martin.




#-#-#  Martin J. Dürst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp   mailto:[EMAIL PROTECTED] 


___
Ietf mailing list
Ietf@ietf.org

Re: IANA, Unicode, and the multilingual Internet

2006-07-24 Thread JFC Morfin

Dear Martin,
Thank you for your comment. It makes plain that we belong to two different 
worlds. My concern is the interoperability of these two worlds. My 
problem is your difficulty in realising that your world is not the only 
one on earth. Let me go through your mail to try to understand why.


At 11:17 24/07/2006, Martin Duerst wrote:

At 04:05 06/07/23, JFC Morfin wrote:

4.3. IANA registries.    In the case of IANA registries there 
is no market alternative [we saw that in the alt-root case]. The 
control of an IANA registry can therefore be strategic. Until now 
the IANA had three main areas: numbers, names, protocol parameters. 
The numbers/names are pure Internet issues but were considered 
sensitive enough to be delegated to ICANN. The new area of languages


This is not a new area. IANA has managed a language tag registry
since around 1995 (see RFC 1766). But it is important to note that
IANA just registers language tags (or since recently, language
subtags), not languages.


This is both true and untrue. The new language subtag and extension 
registries have full autonomy in their area, while the former langtag 
registry was not much used (72 entries in 10 years). The capacity 
of the new registry is important (440 languages, 100 scripts, 250 
country codes which can be organised together to build thousands of 
langtags). The capacity of extensions is limitless. Technology will 
realistically support sociolects and idiolects, meaning billions of tags.
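
As a rough check on the capacity figures quoted above, the following
Python sketch counts the language-script-region combinations that can be
formed. It assumes, for illustration only, that the language subtag is
mandatory and that script and region are each optional; the function
name count_combinations is hypothetical. The result is pure arithmetic
on the quoted counts and says nothing about how many of the
combinations are meaningful.

```python
def count_combinations(languages, scripts, regions):
    """Count language[-script][-region] combinations.

    Assumption: language is mandatory, while script and region are each
    optional, so each optional slot has one extra "absent" choice.
    """
    return languages * (scripts + 1) * (regions + 1)

# Figures quoted in the message: 440 languages, 100 scripts, 250 country codes.
print(count_combinations(440, 100, 250))  # prints 11154440
```

Under these assumptions the count is on the order of eleven million
combinations, before any extensions are considered.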


It is also correct to say that the IANA just registers language 
subtags and not languages. This means labels used to build the 
designation of a language. This means that if you cannot tag (name 
with the RFC 3066 bis format) a language, that language just does not 
exist in the digital system using that tagging. This may not be of 
importance in an obscure local application, but it is not the same for 
the whole Internet.



is not an Internet issue,

RFC 1766, RFC 3066, as well as its approved successors
(draft-ietf-ltru-registry, draft-ietf-ltru-initial and
draft-ietf-ltru-matching) only deal with language tags
on the Internet. It is difficult to understand how language
tagging on the Internet would not be an Internet issue.


Domain Names are an Internet issue. IP addresses also are. Their 
concept originates in the Internet. Language concepts do not 
originate in the Internet technology. Language Tags permit the 
Internet technology to interface with the Language reality. The importance 
of the Internet in the life of the world makes a conflict between the 
Language reality and the Internet Language support a major political, 
societal and economic conflict.


The point is to know who is the master and who is the slave: the man 
or the machine, the IETF or the people. Should RFCs adapt to users, 
or users to RFCs? In the background is the fact that if people are to 
adapt to RFCs, they will have to adapt to the concepts and interests 
of those who wrote them. RFC 3935 answers that point: people are 
to be influenced by the IETF in the way they design, use and manage 
the Internet for the Internet to work better.


I accept that in a technical vision of the world you can think this 
is a good thing. I acknowledge that you may want to develop it. 
However, I cannot support you there: I have a significantly different 
(quasi-universal) vision. I serve the users rather than 
influencing them. Your vision is exclusive and wants to exclude mine 
(we saw it). Mine is to support everyone, including you. This is why 
I needed you to clearly define the way your system is to work.



is far more important and sensitive than names and numbers,

I wouldn't be co-chair of the LTRU WG if I didn't believe
that language tagging is important, but there are far more
important issues (it's e.g. easy to show that 'charset'
tagging is much more important than language tagging,
because the consequences of failures are much greater).


I am afraid you are trapped by your own conceptions and strategy. 
Language tagging and charsets are technical concepts. Reality is made 
of languages, graphemes, phonemes, people, cultures, history, 
countries, etc. What you discuss here is related to the limitations of 
your concepts. You are just saying that Language Tags (which are the IETF 
interface to languages) should consider charsets before scripts.


You may remember that you opposed this when I explained it.

We both know the reason why Unicode chose ISO 15924 and scripts 
rather than keyboards and charsets. And that reason is not technical. 
I do not share that reason. I do not use ISO 15924 except for what 
it is: a list of tags you can use to qualify charsets.



Also, I agree that language tagging occasionally can be
a sensitive issue (a look at the [EMAIL PROTECTED]
mailing list would definitely give that impression), but
by and large, most language tags are used in practice
without any problems.


This is a ... premature affirmation. Up to now there were 72 IANA 
language tags and a lose