RE: Last Call on Language Tags (RE: draft-phillips-langtags-08)

JFC (Jefsey) Morfin Tue, 04 Jan 2005 10:47:04 -0800

Dear Peter, I am sorry to comment on this again. But this is a Last Call on a private proposal, and there is no other forum in which to comment on this document, which is key to the future of the Internet. Nor is there another forum in which to correct what you say about me.

I wish to recall that the main issues are the draft's claim to obsolete RFC 3066 while sometimes conflicting with it, and its extension of scope without limit (cf. Addison Phillips' comment), which would amount to an IESG commitment on the whole multilingual Internet architecture.

I also wish to underline that I agreed with you on many points during the private list discussion and in private mails.

At 03:58 04/01/2005, Peter Constable wrote:

For the past several years the majority of my work has been related to
standards pertaining to IT globalization in one way or another, and I
have encountered a few nexus of people interested in metadata elements
for describing linguistic properties of content; a number of the people
I have encountered in these contexts have congregated (metaphorically)
on the IETF-languages list, and a number of those have provided input on
this draft.

Hopefully a few people have congregated to support the proposed draft. Now their positions have to go through a consensus process. If your proposals harm no one and are useful to some, there is no reason not to reach consensus. Today the consensus leans toward saying that they might be harmful and should therefore be reserved for those who need them. The concern is that this might make them irreversibly incompatible, and that the benefits (for them and for others) are not clear. The aim is to try to clarify.

In each of these contexts, I have encountered general agreement with the idea that it is appropriate to include writing-system distinction as part of language tags; after some time, it has only been in the past couple of weeks that I have encountered people who have questioned the decision to incorporate script IDs, and all of these have been people who have not been subscribed to the IETF-languages list, or at least have not been active contributors to discussion on that list.


I suppose I am among those "rare" newcomers. So let me comment on that:

1. I incorporated my organization supporting international users' needs in 1978 :-)
2. I never objected to the script ID. I objected that it was not given the same importance as the language and country codes. I have pleaded (and acted) for 25 years for the support of authoritative distinctions among users' contexts. But I am not paid by a big employer.
3. I objected to the scarcity of possible tags.
4. I objected to the exclusiveness of a registration approach versus a description approach.
5. I supported the proposed scheme as long as its scope of application was defined and it was not a takeover of the multilingual Internet.

Last but not least, I received enough off-list support to agree to spend time on this. There is NO consensus in the community, and there are huge technical, societal, economic and political concerns, because one does not understand what the draft wants to achieve, for whom, and how. The main request is to clarify. There are no real objections (except to the paucity of the proposal), only concerns.

> It would be very helpful, to me at least, if you or he could
> identify the specific context in which such tags would be used
> and are required.  The examples should ideally be of
> IETF-standard software, not proprietary products.


You give none in response, just an application-level problem.

I've used Chinese as one example, but there are many other cases, some
familiar to many and some less well known. Also, in relation to IETF
protocols, I mentioned only HTTP, but the same problems likely exist for
other protocols involving textual linguistic content where RFC 3066 is
used. For example, in searching for items in an LDAP directory, it may
be appropriate for an AttributeDescription to specify Traditional Chinese
rather than Simplified Chinese, or Serbian using the Latin-script
orthography vs. Serbian using the Cyrillic-based orthography.

Full agreement. But this is to be done through open and inclusive semantics, not on an exclusive first-come, first-served registration basis. The next step will be patents on language descriptions.
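
As a rough illustration of the distinctions discussed above (Traditional vs. Simplified Chinese, Latin-script vs. Cyrillic-script Serbian), here is a minimal Python sketch that splits a tag into language, script and region subtags. The simplified language[-script][-region] shape, the helper name split_tag, and the sample tags zh-Hant, zh-Hans, sr-Latn and sr-Cyrl-CS are assumptions made for illustration; this is not the draft's full grammar, and it says nothing about how subtags get registered.

```python
import re

# Assumed, simplified shape: language[-script][-region].
# Variants, extensions and private-use subtags are deliberately ignored.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"              # 2- or 3-letter language code
    r"(?:-(?P<script>[A-Za-z]{4}))?"             # optional 4-letter script code
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"   # optional 2-letter or 3-digit region
)


def split_tag(tag):
    """Return a dict of subtags, or None if the tag does not fit the assumed shape."""
    match = _TAG.match(tag)
    if match is None:
        return None
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical sample tags, chosen to mirror the examples in the text.
    for sample in ("zh-Hant", "zh-Hans", "sr-Latn", "sr-Cyrl-CS"):
        print(sample, split_tag(sample))
```

A description-oriented approach could attach further attributes to the same dictionary; the sketch only shows that script can be carried as a subtag of equal standing with language and region.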

In ideal terms, I do not think that all of the complexity of the
proposed draft is needed.

So let us simplify it, and let us dig into the areas where complexity comes from limited possibilities.

On the other hand, I think that some people's
characterization of the excessive complexity has been overstated, some
of the complexity I consider superfluous but not particularly harmful
(notably the extensions), and some of the complexity I think is an
unfortunate result of existing implementations and past practice (in
particular, the steps taken to avoid instability of ISO 3166 and the use
of both UN numeric IDs and ISO 3166 due to the combination of prior
usage of ISO 3166-1 together with the need for region identifiers other
than those provided by ISO 3166-1).


Complexity in real-life issues often comes from:
- patching previous mistakes
- patching the rigidity introduced by previous simplifications.

Part of my reluctance to have script IDs included in RFC 3066 was due to
the fact that a set of tags had just been registered (some of which I
now wish didn't exist) which used various subtags in combination, and I
sensed that there was a lack of collective understanding of what the
internal structure of tags and relationships between subtags should be
(which is a direct cause that led me to write the paper I referred to
earlier).

This documents that "collective community thinking" is not always correct. This is why I am reluctant to accept any registration process for a (proprietary?) blend of existing open tag lists.

I have been party to the review process for the past five or so years,
and can say that the review process did not, IMO, always succeed in
avoiding regrettable tags (I do not consider those that include script
IDs to be among them) because there was a lack of a model of what
ontology was needing to be described and what the appropriate elements
within a tag standing in what kind of relationship to one another were
needed. This draft doesn't describe such a model, but it does impose
one, which I think is moving in a good direction.

!!!!!!! Are you actually saying that the value of this draft is to impose the sorely missing model ... without describing it ????? Or am I misreading?

Actually, no; I was trying to guess at existing applications that might
have particular problems with complexity, as you mentioned. Certainly
language-range matching is no more complex in the proposed draft than it
is today. I personally suspect that the language-range matching
algorithm is too simplistic, but I haven't gone beyond that myself to
start suggesting it needs to be replaced with something more complex.

Why do you want there to be an exclusive, _unique_ matching algorithm? It is up to the application to decide on the algorithm when receiving a tag, and possibly to negotiate it in a web service or when invoking an OPES.
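
For reference, the matching rule being discussed can be sketched in a few lines of Python. This assumes the RFC 3066 prefix rule (a range matches a tag if it equals the tag, or equals a prefix of it that is followed by "-", with "*" matching everything); the function name range_matches and the sample tags are hypothetical.

```python
def range_matches(language_range, tag):
    """Prefix-style language-range matching, compared case-insensitively.

    Assumption: this follows the RFC 3066 rule as summarized in the lead-in;
    an application remains free to use a different algorithm.
    """
    wanted = language_range.lower()
    candidate = tag.lower()
    if wanted == "*":
        # The wildcard range matches any tag.
        return True
    # Exact match, or the range is a prefix ending at a subtag boundary.
    return candidate == wanted or candidate.startswith(wanted + "-")


if __name__ == "__main__":
    # Hypothetical tags used only to exercise the rule.
    print(range_matches("zh", "zh-Hant"))        # range is a prefix of the tag
    print(range_matches("zh-Hant", "zh-Hans"))   # different script subtags
    print(range_matches("zh-TW", "zh-Hant-TW"))  # script subtag sits in between
```

Under this rule the range zh-TW does not match a tag such as zh-Hant-TW, which is one reason an application might prefer to choose or negotiate its own matching.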

For my part, I made a point of informing TC 37 members of the
re-assignment of CS, and that led to a resolution at our Paris meeting
last August expressing strong concern over this. I did not ever hear any
response from either TC 46 or the ISO 3166 MA on this matter, however. I
don't know that I would have devised the approach to the handling of
this issue used in this draft had I been its author. I am deeply
concerned that stability be ensured in language tags, however, and if
this is the only way to ensure it I can accept it.

We had a long talk at the end of the August Paris meeting at AUF about ISO 639-2 and the need to aggregate language ID, script ID, usage description, authoritative sources and country codes, about the complexity of taking "sub-codes" and private codes into account, and about adding accidental or new descriptors in order to document vernacular ways of speaking, thinking and talking. Obviously it was a private discussion among a few people sharing the same ideas ... Maybe you were there (we were the last to leave the room and the building).

All the best.
jfc

