Re: Reply-To mess opinion [was Re: Unicode on a non-Unicode

2000-09-18 Thread John Wilcock

On Sun, 10 Sep 2000 04:39:22 -0800 (GMT-0800), Harald Alvestrand
wrote:
 I have the opposite experience from Simon Hill: Most reply-to munging lists 
 get regular complaints; those that don't do it don't want it.

My experience is that the complaints arise when the list
*systematically* adds a Reply-To, overwriting any Reply-To that may
have been set by the sender.

Lists which add a Reply-To if and only if the original message *does
not* contain this header are, in my experience, trouble-free.
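
To make the rule concrete, here is a minimal Python sketch of "add a
Reply-To only if the sender set none" (the addresses and the use of the
standard email module are illustrative, not taken from any particular
list manager):

    from email.message import EmailMessage

    def add_list_reply_to(msg: EmailMessage, list_address: str) -> None:
        # Add a Reply-To pointing at the list only if the sender set none;
        # an existing Reply-To from the sender is left untouched.
        if msg.get("Reply-To") is None:
            msg["Reply-To"] = list_address

    msg = EmailMessage()
    msg["From"] = "someone@example.org"
    add_list_reply_to(msg, "list@example.net")
    print(msg["Reply-To"])          # list@example.net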

John.

-- 
-- Over 1200 webcams from ski resorts around the world - 
http://www.tradoc.fr/john/webcams/
-- Translate your technical documents and web pages- http://www.tradoc.fr/en/



Fwd: I-D ACTION:draft-duerst-i18n-norm-04.txt

2000-09-18 Thread Martin J. Duerst


To: IETF-Announce: ;
From: [EMAIL PROTECTED]
Reply-to: [EMAIL PROTECTED]
Subject: I-D ACTION:draft-duerst-i18n-norm-04.txt
Date: Thu, 14 Sep 2000 06:57:36 -0400
Sender: [EMAIL PROTECTED]

A New Internet-Draft is available from the on-line Internet-Drafts 
directories.


 Title   : Character Normalization in IETF Protocols
 Author(s)   : M. Duerst, M. Davis
 Filename: draft-duerst-i18n-norm-04.txt
 Pages   : 12
 Date: 13-Sep-00

The Universal Character Set (UCS) [ISO10646, Unicode] covers a very
wide repertoire of characters. The IETF, in [RFC 2277], requires that
future IETF protocols support UTF-8 [RFC 2279], an ASCII-compatible
encoding of UCS. The wide range of characters included in the UCS has
led to some cases of duplicate encodings. This document proposes
that in IETF protocols, the class of duplicates called canonical
equivalents be dealt with by using Early Uniform Normalization
according to Unicode Normalization Form C, Canonical Composition (NFC)
[UTR15]. This document describes both Early Uniform Normalization
and Normalization Form C.
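
As a concrete illustration of canonical equivalence and NFC (not part of
the announced draft; it uses Python's standard unicodedata module):

    import unicodedata

    # Two canonically equivalent encodings of "é":
    precomposed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
    decomposed  = "e\u0301"     # U+0065 + U+0301 COMBINING ACUTE ACCENT

    print(precomposed == decomposed)        # False: different code point sequences
    print(unicodedata.normalize("NFC", precomposed)
          == unicodedata.normalize("NFC", decomposed))   # True: both become U+00E9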

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-duerst-i18n-norm-04.txt

Internet-Drafts are also available by anonymous FTP. Login with the username
"anonymous" and a password of your e-mail address. After logging in,
type "cd internet-drafts" and then
 "get draft-duerst-i18n-norm-04.txt".

A list of Internet-Drafts directories can be found in
http://www.ietf.org/shadow.html
or ftp://ftp.ietf.org/ietf/1shadow-sites.txt


Internet-Drafts can also be obtained by e-mail.

Send a message to:
 [EMAIL PROTECTED]
In the body type:
 "FILE /internet-drafts/draft-duerst-i18n-norm-04.txt".

NOTE:   The mail server at ietf.org can return the document in
 MIME-encoded form by using the "mpack" utility.  To use this
 feature, insert the command "ENCODING mime" before the "FILE"
 command.  To decode the response(s), you will need "munpack" or
 a MIME-compliant mail reader.  Different MIME-compliant mail readers
 exhibit different behavior, especially when dealing with
 "multipart" MIME messages (i.e. documents which have been split
 up into multiple messages), so check your local documentation on
 how to manipulate these messages.


Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.
Content-Type: text/plain
Content-ID: [EMAIL PROTECTED]

ENCODING mime
FILE /internet-drafts/draft-duerst-i18n-norm-04.txt

ftp://ftp.ietf.org/internet-drafts/draft-duerst-i18n-norm-04.txt




Re: [idn] nameprep forbidden characters

2000-09-18 Thread Martin J. Duerst

I think it's very useful to know about the problems of Hebrew software
with points, and about the problems that Hebrew users have with using
points. And Jonathan is definitely in the best position to know about
that.

However, that doesn't mean that the best solution is to ignore points
on the client side. For example, Yiddish uses pointed letters in quite
a rather different way; there they cannot be ignored. The same may apply to
other languages written with the Hebrew script. There may also be cases
where a point can indeed make a difference.

One obvious solution to the problem is simply to ignore it. If Jonathan
is right, registering names with points won't be attractive, and so
there will automatically be very few registrations with points.
If it's difficult for users to input the points, then they will
be quite content to input just the base letters. Everything
will work together. For those cases where it's necessary to make
a distinction (e.g. Yiddish), there won't be any problems.

Regards,   Martin.

At 00/09/17 10:04 -0700, Mark Davis wrote:
I am curious why you feel so strongly that the Hebrew points should be ignored
in domain names. Prima facie, it seems that there is little harm in treating
them no differently from other characters. What problem would arise if the
domain was ABC.COM and I could not get it by typing AB*C.COM? (Here uppercase
stands for Hebrew, and * for a point.) Conversely, if someone really did
register AB*C.COM, would it be a problem that I couldn't get to that 
location by
typing ABC.COM?

It is my understanding that the vowels are rarely used, and that people really
wouldn't use them in registered domain names anyway. It seems that if someone
did take the trouble to type in the points, that there would be a reason for
their making such a distinction.

I'd appreciate it if you could help me to understand the issue more clearly.

Mark

Jonathan Rosenne wrote:

  We should distinguish "punctuation", like 060C Arabic Comma, and
  "diacritics", such as 064E Arabic Fatha. Diacritics is probably the wrong
  word. I have the impression that you were referring to the latter.
 
  For Hebrew, my opinion is that from the point of view of the user,
  punctuation should be forbidden, while diacritics such as the vowels and
  other combining characters should be allowed and be ignored.
 
  I believe it is important that the rules for Arabic and Hebrew should 
 be the
  same as far as possible.
 
  Jony
 
   -Original Message-
   From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
   Behalf Of Wael Nasr
   Sent: Saturday, September 16, 2000 1:16 AM
   To: Edmon; idn working group; Adam M. Costello
   Subject: RE: [idn] nameprep forbidden characters
  
  
   Wanted to share with you that in the Arabic working group of MINC we
   have discussed this point at length.
   In Arabic the meaning of a word can change depending on punctuation;
   for example, the words for "knowledge" and "flag" in Arabic are exactly
   the same except for the punctuation.
  
   It is my opinion that, at least regarding Arabic, no punctuation
   should be allowed for now.
  
   I am sure that 5 years from now, domain name systems will be much more
   dynamic than what we have now and will not be simply a mapping of
   Unicode or ASCII to an IP number.
   At that time, punctuation can be allowed to be part of the game.
   Wael
  
   ---
   Wael Nasr
   Director, Middle East Business Development
   I-DNS.net
   [EMAIL PROTECTED]
   Cell Phone(Egypt):+(201) 222 55 380
  
   -Original Message-
   From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of
   Edmon
   Sent: Saturday, September 16, 2000 12:59 AM
   To: idn working group; Adam M. Costello
   Subject: Re: [idn] nameprep forbidden characters
  
  
 Perhaps host names
should avoid all punctuation in all languages so people don't have to
worry about it.
  
   I think we have to remember that it is the registrant's choice to
   choose a name that best reflects their identity online.  Punctuation
   marks may serve as great symbols that identify an entity; for example,
   a person called O'Brian would want to have the apostrophe in his domain
   name, and a company AB would want the "" in their name.  Our move to
   multilingual domain names is the best opportunity for us to re-include
   these worthwhile and long-awaited symbols in the domain name space.
  
   Edmon
  
   
AMC
  
  
  





http://www.unicode.org/unicode/standard/standard.html

2000-09-18 Thread Otto Stolz

Hello,

I had written:
 I have re-read section "Controlling Ligatures", in TUS 3.0, p. 318.

On 2000-09-15 at 14:40 UTC, Mark Davis wrote:
 I'd like to remind everyone to look at the latest version of the Unicode
 Standard, especially when looking at fine points. To cite Unicode 3.0.1
 (http://www.unicode.org/unicode/standard/versions/Unicode3.0.1.html)

Thank you for this pointer.

Actually, I had looked for this information before I wrote my previous
note, but could not find it; hence, I resorted to TUS 3.0.

The reason for my failure to locate 3.0.1 on the Unicode WWW site is an
improper organisation of the latter: I followed this trail:
- http://www.unicode.org/: About the Unicode/Standard
  - http://www.unicode.org/unicode/standard/standard.html:
Version 3.0
The most current major version of the Unicode standard...
Here ended my search. It did not occur to me that I would find an even
more current version of the Unicode standard under the "Versions of the
Unicode Standard" link.

So please make sure that the wording on your "Unicode Standard" page is
not misleading, and that there is a conspicuous link to the current
standard, even if it is not a major version.

Thank you, and best wishes,
   Otto Stolz



Re: New Locale Proposal

2000-09-18 Thread Antoine Leca

I do not know whether this proposal is good or evil.
But in any case there are some points that need to be improved, IMHO.

Carl W. Brown wrote:
 
 The locale will consist of three parts:
 
 1) A modified lower case RFC 1766bis language
 
 2) An ISO 3166 country code

Can you allow for areas that are a little bigger?
The first obvious case is the EU (but I believe it may soon become an
ISO 3166 code). Problematic cases also include the Arab countries
and Spanish America, where the unity of language combined with
the differences between countries creates a long list of almost completely
virtual locales (that is, apart from the need to tag monetary amounts,
these locales are non-informative). The same problem holds for French in
Africa and, to a lesser extent, English over wide areas of the Earth.
 
 3) A variant
 


 The modifications to RFC 1766bis to make it better suited for locales are as
 follows:
 
 1) Normalize to single form when possible.  Use ISO 639-1 code instead of
 639-2 if one exists.

Are you forced to re-tag every bit of data when ISO 639/RA issues a
new code?



 3) Variants that are not related to language are locale variants.
 fr_FR_EURO

Can people *please* avoid this abuse of the variant idea?

We are less than 16 months from the end of the use of the FRF. So 16
months from now, the "fr_FR" locale will become completely
indistinguishable from your example.  Unless you want to force us to
abandon "fr_FR" and reserve it for tagging obsolete data, but
I can tell you that is an already lost battle.
This is a big problem for a draft RFC that will take around,
say, 15 months (;-)) to be completed.

Now, if we try to be a bit more clever, the locale that speaks
French and labels monetary amounts in euros should be named
"fr_EU", for anything except very peculiar and very rare uses.
There are as many differences between France's French and Belgian
French as between Scottish English and London English (the most
notable being the use of "octante" instead of "quatre-vingt" for
eighty); and I believe the same holds for the few other similar cases,
like "de_EU" for "de_DE"/"de_AT", "nl_EU" for "nl_NL"/"nl_BE", and
perhaps in the future "en_EU" for "en_IE"/"en_GB" or "sv_EU" for
"sv_FI"/"sv_SE".
Furthermore, the small countries and the like, such as "LU", "AD",
"SM", "MC" or "VC", for which independent locales would be something
of a joke (I except "lb_LU"), will then be covered easily.

 
 5) Convert all non-human locales "C" and "POSIX" to human locales, e.g. en_US.

There are BIG differences between "C"/"POSIX" and "en_US".
If you do not see that, then I believe there are big holes
in the intended uses of these new locales.

A major one is that "POSIX" collates in the same order as ASCII,
while I do not believe you are willing to impose this burden
on every user of "en_US"!
The whole point of "C" and "POSIX" (or their big brother "i18n"),
as locales, is to provide predictability of execution in an area where
fuzziness is the rule. And yes, there are cases where this is
much more important than displaying user-friendly dates...

Furthermore, I am not at all sure that mapping "C" to "en_US" will
be welcome everywhere (even if C99 now insists that the names
used in full-text dates are the English ones). I am not even sure
this is conforming, even assuming the _classical_ "en_US" where
accented characters are considered punctuation.
In any case, the modern, Unicode-conformant definition of "en_US"
will certainly not qualify.
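
A minimal Python sketch of that collation difference (the locale names
are system-dependent and illustrative; "en_US.UTF-8" may not be
installed everywhere):

    import locale

    words = ["zebra", "Apple", "apple", "Zebra"]

    # "C"/"POSIX" collation is plain code-point (ASCII) order: uppercase sorts first.
    locale.setlocale(locale.LC_COLLATE, "C")
    print(sorted(words, key=locale.strxfrm))    # ['Apple', 'Zebra', 'apple', 'zebra']

    # A language-aware locale interleaves upper and lower case instead.
    try:
        locale.setlocale(locale.LC_COLLATE, "en_US.UTF-8")
        print(sorted(words, key=locale.strxfrm))    # e.g. ['apple', 'Apple', 'zebra', 'Zebra']
    except locale.Error:
        pass    # locale not available on this system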


Antoine



Re: New Locale Proposal

2000-09-18 Thread Doug Ewell

Antoine Leca [EMAIL PROTECTED] wrote:

 1) Normalize to single form when possible.  Use ISO 639-1 code
 instead of 639-2 if one exists.

 Are you forced to re-tag every bit of data when ISO 639/RA issues a
 new code?

From what I have heard, ISO 639/MA will not be issuing any new 639-1
(two-letter) codes for languages that already have a 639-2 (three-
letter) code.  So this re-tagging scenario should not occur and Carl's
solution, which is the same as that proposed in RFC 1766 bis, should
work fine.

-Doug Ewell
 Fullerton, California



Re: http://www.unicode.org/unicode/standard/standard.html

2000-09-18 Thread Mark Davis

The paragraph reads:

The most _current major version of the Unicode
   standard_ contains 49,194 distinct coded characters.
   These characters cover the principal written
   languages of the Americas, Europe, the Middle East,
   Africa, India, Asia, and the Pacific Basin. Certain
   technical reports are approved and part of 3.0.  For a
   list, see the _Technical Reports_ page. Note that
   Version 3.0 is amended by an update version,
   _Unicode 3.0.1_.

so the note is at the bottom ("_" indicates a link).

I am not criticizing your message -- if you missed it, certainly a lot of
other people will! We will have to rewrite the paragraph for clarity.

Otto Stolz wrote:

 Hello,

 I had written:
  I have re-read section "Controlling Ligatures", in TUS 3.0, p. 318.

 On 2000-09-15 at 14:40 UTC, Mark Davis wrote:
  I'd like to remind everyone to look at the latest version of the Unicode
  Standard, especially when looking at fine points. To cite Unicode 3.0.1
  (http://www.unicode.org/unicode/standard/versions/Unicode3.0.1.html)

 Thank you for this pointer.

 Actually, I had looked for this information before I wrote my previous
 note, but could not find it; hence, I resorted to TUS 3.0.

 The reason for my failure to locate 3.0.1 on the Unicode WWW site is an
 improper organisation of the latter: I followed this trail:
 - http://www.unicode.org/: About the Unicode/Standard
   - http://www.unicode.org/unicode/standard/standard.html:
 Version 3.0
 The most current major version of the Unicode standard...
 Here ended my search. It did not occur to me that I would find an even
 more current version of the Unicode standard under the "Versions of the
 Unicode Standard" link.

 So please make sure that the wording on your "Unicode Standard" page is
 not misleading, and that there is a conspicuous link to the current
 standard, even if it is not a major version.

 Thank you, and best wishes,
Otto Stolz




Re: [idn] nameprep forbidden characters

2000-09-18 Thread Mark H. David

From: "Martin J. Duerst" [EMAIL PROTECTED]
 However, that doesn't mean that the best solution is to ignore points
 on the client side. For example, Yiddish uses pointed letters in quite
 a bit a different way; they cannot be ignored. The same may apply to
 other languages written with the Hebrew script. There may also be cases
 where a point can indeed make a difference.

The points can and should be ignored for Internet domain name resolution.
They cannot be ignored in all cases for every Hebrew-script
application anyone can come up with now or in the future, whether
for Hebrew or Yiddish or Ladino or whatever. Similarly, letter case cannot
be ignored for Latin-script languages in every possible application. But I feel,
as a Yiddish expert, that the points can and should be ignored for the
application in question, Internet domain name resolution. I claim this applies
just as much to Ladino, though I can't claim expertise in that language, only
superficial knowledge. Other Jewish languages (using Hebrew script) are archaic
and are not in use, or hardly in use, by living speakers.
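
As a rough sketch of what "ignoring the points" could mean at resolution
time (purely illustrative; this is not what nameprep specifies), one
could strip combining marks before lookup:

    import unicodedata

    def strip_points(label: str) -> str:
        # Decompose, then drop combining marks (category Mn), which include
        # the Hebrew points; only the base letters remain.
        decomposed = unicodedata.normalize("NFD", label)
        return "".join(ch for ch in decomposed
                       if unicodedata.category(ch) != "Mn")

    # "shalom" written with shin dot, qamats and holam vs. the bare letters
    pointed   = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
    unpointed = "\u05e9\u05dc\u05d5\u05dd"
    print(strip_points(pointed) == unpointed)   # True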





Re: New Locale Proposal

2000-09-18 Thread Antoine Leca

Doug Ewell wrote:
 
 Antoine Leca [EMAIL PROTECTED] wrote:
 
  1) Normalize to single form when possible.  Use ISO 639-1 code
  instead of 639-2 if one exists.
 
  Are you forced to re-tag every bit of data when ISO 639/RA issues a
  new code?
 
 From what I have heard, ISO 639/MA will not be issuing any new 639-1
 (two-letter) codes for languages that already have a 639-2 (three-
 letter) code.  So this re-tagging scenario should not occur and Carl's
 solution, which is the same as that proposed in RFC 1766 bis, should
 work fine.

I *must* have missed something.

In the last publication of new codes, there was "bs" for "Bosnian".
My understanding of the situation in the former Yugoslavia is that
the language which is intended to be tagged is a form of Serbo-Croatian
that is spoken in the country named Bosnia-Herzegovina (not sure
about Herzegovina), and outside this country by natives or relatives
of natives of that very country.
Now this language is not a sudden invention: it was known before. And
as I understand things, this language was tagged "hr-XX-Bosnian", or
something like that (XX being the relevant country of the speaker).
So now the (probably fictitious) document is supposed to be re-tagged
as "bs-XX". Or have I missed something?

Another example: a text in Avestan was, before the last change, tagged as
"x-Avestan" or "x-Avesta" or "x-zend" or a number of other things, depending on
the tagger. Now, should they be re-tagged? (And don't get me wrong, that will
certainly be a Good Thing; remember, the question is about the requirement.)


Antoine



Re: New Locale Proposal

2000-09-18 Thread Marion Gunn

The opposite is true, Doug. ISO 639 will ONLY issue new 639-1
(two-letter) codes for languages that already have a 639-2 (three-letter)
code. That means, in effect, that the ISO 639-1/MA (AT InfoTerm) has its
hands tied: it can no longer register any new language tag identifiers
for languages not already approved by the ISO 639-2/MA (US Library of
Congress).
mg

Arsa Doug Ewell:

 Antoine Leca [EMAIL PROTECTED] wrote:

  1) Normalize to single form when possible.  Use ISO 639-1 code
  instead of 639-2 if one exists.
 
  Are you forced to re-tag every bit of data when ISO 639/RA issues a
  new code?

 From what I have heard, ISO 639/MA will not be issuing any new 639-1
 (two-letter) codes for languages that already have a 639-2 (three-
 letter) code.  So this re-tagging scenario should not occur and Carl's
 solution, which is the same as that proposed in RFC 1766 bis, should
 work fine.

 -Doug Ewell
  Fullerton, California




--
Marion Gunn
Everson Gunn Teoranta
http://www.egt.ie





Re: New Locale Proposal

2000-09-18 Thread Antoine Leca

Carl W. Brown wrote:
 
 I am sorry that my previous reply was so short, I was rushed.  A bit of
 background:

AH...

I am sorry, I entirely missed your point on first reading. I thought
you intended a new design for the locale issue.

You can safely drop my comments; they mostly do not apply to the problem you
are addressing. I apologize for the confusion I caused.


Since I have to send this message, here are a few more comments on your notes.
Mostly for fun...

 From: Antoine Leca [mailto:[EMAIL PROTECTED]]
 Carl W. Brown wrote:
 
  The locale will consist of three parts:
 
  1) A modified lower case RFC 1766bis language
 
  2) An ISO 3166 country code
 
 Can you allow for areas that are a little bigger ?
 The first obvious case is the EU (but I believe it may soon become a
 ISO 3166 code). Problematic cases also include the Arabic countries
 and the Spanish America, where the unity of language conjugated with
 the differences in countries create a long list of almots completely
 virtual locales (that is, outside the need to tag monetary amounts,
 these locales are non-informative). Same problem for French in
 Africa and, to a lesser extend, English on wide areas on Earth.
 
 Good point.  A combined South American Spanish is also a good starting point
 for a neutral Spanish dialect.  I guess you can always use a 5-8 character
 language variant.

I guess this too, but I believe(d) standardization in this area might help.
Alas, this does not appear to be the way we are going.

As a European, I assume you meant "a neutral Hispanoamerican Spanish" above,
i.e. you want to dissociate European Spanish from Hispanoamerican Spanish
(note to non-Spanish speakers: this makes a lot of sense).
"Neutral Spanish" already has a locale code, "es"; there is no need for one here.


 On the other hand, for a language like Portuguese you might want to use
 Brazilian Portuguese from Minas Gerais as a neutral form of the language.
 This might be a case for your ISO 3166-2 codes: Brazil is the major
 producer of TV and movies and influences the Portuguese language.

Sounds OK as far as I know, but I do not know Brazil's linguistic situation!

 I guess it is like taking
 California English as a standard, maybe resented but generally understood.

But this one is funny. Here in France, "Californian English" (which we
usually call West Coast American English) is taken as the prototypical example
of the hard-to-understand American way of talking. Of course, people in
contact with "real" Americans know about Arizona or Texas or Ebonics
(no offence intended; insert your case here :-)) or X...; but the symbol
remains the "West Coast"...


Antoine



Re: New Locale Proposal

2000-09-18 Thread John Cowan

Marion Gunn wrote:
 
 The opposite it true, Doug. ISO 639 will ONLY issue new 639-1
 (two-letter) codes for languages that already have a 639-2 (three-letter)
 code. 

Almost, but not quite.  If that were true, 639-2 tags could become effectively
obsolete.  The true rules AFAIU are:

1) A language with a 639-1 tag has and will always have a 639-2 tag as well.
E.g. English has tags "eng" and "en".

2) A language which currently has a 639-2 tag but not a 639-1 tag will not
get a new 639-1 tag in future.  E.g. Arapaho has tag "arp" but will never
have a 639-1 tag.

3) Therefore, the only future 639-1 tags are those assigned to new (i.e.
not in 639-2) languages, simultaneously with a 639-2 tag.  E.g. Lojban,
a currently untagged language, might get the tags "loj" and "lj".
(When Hell freezes over.)
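
These rules are what make the "use the 639-1 code if one exists"
normalization mentioned earlier safe to apply mechanically; a toy Python
sketch with an illustrative, incomplete mapping:

    # Every 639-1 code has a 639-2 twin, so normalization only ever maps
    # 639-2 codes down to 639-1; 639-2-only codes (e.g. "arp") stay as they are.
    ISO_639_2_TO_1 = {"eng": "en", "fra": "fr", "deu": "de", "bos": "bs"}  # sample only

    def normalize_language_code(code: str) -> str:
        code = code.lower()
        return ISO_639_2_TO_1.get(code, code)

    print(normalize_language_code("ENG"))   # en
    print(normalize_language_code("arp"))   # arp  (no 639-1 code, kept as-is)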

-- 
There is / one art   || John Cowan [EMAIL PROTECTED]
no more / no less|| http://www.reutershealth.com
to do / all things   || http://www.ccil.org/~cowan
with art- / lessness \\ -- Piet Hein



Re: New Locale Proposal

2000-09-18 Thread Marion Gunn

Absolutely true, John, and said far more succinctly than I did.

The most significant aspect of this is that the work of registering codes should
be processed much faster in future because, although for now there may still
exist two separate Maintenance Agencies to process requests, applicants applying
to AT InfoTerm for 639-1 codes will, in future, as you say below, simultaneously
have to satisfy the 639-2 US LOC requirements.
mg

Arsa John Cowan:

 Marion Gunn wrote:
 
  The opposite it true, Doug. ISO 639 will ONLY issue new 639-1
  (two-letter) codes for languages that already have a 639-2 (three-letter)
  code.

 Almost, but not quite.  If that were true, 639-2 tags could become effectively
 obsolete.  The true rules AFAIU are:

 1) A language with a 639-1 tag has and will always have a 639-2 tag as well.
 E.g. English has tags "eng" and "en".

 2) A language which currently has a 639-2 tag but not a 639-1 tag will not
 get a new 639-1 tag in future.  E.g. Arapaho has tag "arp" but will never
 have a 639-1 tag.

 3) Therefore, the only future 639-1 tags are those assigned to new (i.e.
 not in 639-2) languages, simultaneously with a 639-2 tag.  E.g. Lojban,
 a currently untagged language, might get the tags "loj" and "lj".
 (When Hell freezes over.)

 --
 There is / one art   || John Cowan [EMAIL PROTECTED]
 no more / no less|| http://www.reutershealth.com
 to do / all things   || http://www.ccil.org/~cowan
 with art- / lessness \\ -- Piet Hein




--
Marion Gunn
Everson Gunn Teoranta
http://www.egt.ie





This is not UniLocale!

2000-09-18 Thread Ayers, Mike


Isn't there a more appropriate forum for the localization issues?  I
might even subscribe.  However, let's please move the topic to a more
appropriate place and let character encoding issues comprise at least half
the traffic around here.

Thanks,


/"\/|/|ike /+yers 
\ / ASCII Ribbon Campaign
 X  Against HTML Mail   Test Engineer
/ \BMC Software, Inc.



Re: This is not UniLocale!

2000-09-18 Thread Michael \(michka\) Kaplan

A noble thought, Mike. But how exactly would you suggest legislating the
feeling of what is important in the minds of others?

My overall impression is that people ask here because they are looking for
the slant that they would get from this group. And let's face it... if there
were no other locales, there probably would not be other languages, or
other scripts. And then there would be no need for Unicode. :-)

Willie Sutton was once misquoted when asked why he robbed banks (the claim
was that he said "that's where the money is"). This is where the languages
are...

michka

a new book on internationalization in VB at
http://www.i18nWithVB.com/

- Original Message -
From: "Ayers, Mike" [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Monday, September 18, 2000 4:13 PM
Subject: This is not UniLocale!



 Isn't there a more appropriate forum for the localization issues?  I
 might even subscribe.  However, let's please move the topic to a more
 appropriate place and let character encoding issues comprise at least half
 the traffic around here.

 Thanks,


 /"\/|/|ike /+yers
 \ / ASCII Ribbon Campaign
  X  Against HTML Mail   Test Engineer
 / \BMC Software, Inc.





RE: New Locale Proposal

2000-09-18 Thread Carl W. Brown

 Antoine wrote:

As a European, I assume you meant "a neutral Hispanoamerican Spanish" above,
i.e. you want to dissociate European Spanish from Hispanoamerican Spanish
(note to non-Spanish speakers: this makes a lot of sense).
"Neutral Spanish" already has a locale code, "es"; there is no need for one here.

You are right. I don't consider Mexican Spanish very neutral either.

But this one is funny. Here in France, "Californian English" (which we
usually call West Coast American English) is taken as the prototypical
example of the hard-to-understand American way of talking.

However, because Europeans are better trained in languages, they can tolerate
accent differences better.  Try sending a Midwestern American to New Zealand,
for example.

If you want funny, my college roommate was a French major from Georgia who
spoke French with a heavy Southern American accent.

As to your other message about "hr-XX-Bosnian" and "bs-XX": this would have
first been converted to hr-bosnian_XX, to keep the language variant together
with the language and before the country.  Then it would be converted to
bs_XX when the new standard was implemented.
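
A toy Python sketch of that two-step conversion (the tag syntax and the
mapping are purely illustrative, not taken from any standard):

    NEW_LANGUAGE_CODE = {("hr", "bosnian"): "bs"}   # assumed mapping, for illustration

    def step1(tag: str) -> str:
        # "hr-XX-Bosnian" -> "hr-bosnian_XX": keep the variant with the
        # language, and move the country to the end.
        lang, country, variant = tag.split("-")
        return f"{lang}-{variant.lower()}_{country}"

    def step2(tag: str) -> str:
        # "hr-bosnian_XX" -> "bs_XX": apply the new ISO 639-1 code once it exists.
        lang_part, country = tag.split("_")
        key = tuple(lang_part.split("-"))
        return NEW_LANGUAGE_CODE.get(key, lang_part) + "_" + country

    print(step2(step1("hr-XX-Bosnian")))    # bs_XX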

Carl





This is not UniLocale!

2000-09-18 Thread Doug Ewell

Mike Ayers [EMAIL PROTECTED] wrote:

   Isn't there a more appropriate forum for the localization
 issues?  I might even subscribe.  However, let's please move the
 topic to a more appropriate place and let character encoding issues
 comprise at least half the traffic around here.

For my part, I started and have continued the language tag discussion
because of interest in Unicode's (discouraged) Plane 14 language tags.
The spillover discussion on RFC 1766 bis and Ethnologue tags is still
interesting to me (although the POSIX locale discussion is not).

I apologize to those who are getting sick of all this.  I agree that
some more genuine Unicode topics would be welcome, though I have none
to contribute at present.

 /"\
 \ / ASCII Ribbon Campaign
  X  Against HTML Mail
 / \

A noble cause, and one I support -- however, no more noble than the
Great Crusade to Stamp Out UTF-7 Mail Headers.

-Doug Ewell
 Fullerton, California



Re: This is not UniLocale!

2000-09-18 Thread David Starner

On Mon, Sep 18, 2000 at 07:24:30PM -0800, Doug Ewell wrote:
 A noble cause, and one I support -- however, no more noble than the
 Great Crusade to Stamp Out UTF-7 Mail Headers.

(And this is on-list:) Why? What's so evil about UTF-7 mail headers? What about
them would make them non-trivial for any Unicode-compliant mailer to handle?
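
For what it's worth, decoding such a header needs nothing exotic; a
minimal sketch using Python's standard email module (the sample header
is made up):

    from email.header import decode_header

    # "Café" spelled with UTF-7 inside an ordinary RFC 2047 encoded-word.
    raw = "=?UTF-7?Q?Caf+AOk-?="

    decoded = "".join(
        part.decode(charset or "ascii") if isinstance(part, bytes) else part
        for part, charset in decode_header(raw)
    )
    print(decoded)    # Café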

-- 
David Starner - [EMAIL PROTECTED]
http/ftp: dvdeug.dhis.org
And crawling, on the planet's face, some insects called the human race.
Lost in space, lost in time, and meaning.
-- RHPS