Re: Comments on FCD 5218, "Codes for the representation of human sexes"

2001-12-01 Thread Thierry Sourbier

Mike,

> In the column headed "Élément de donnée", the data items
> corresponding to the codes 1 and 2 are given as "Masculin"
> and "Féminin".  In the English version they are correctly
> given as "Male" and "Female".  Is it not the case that
> French maintains the same distinction between sex (of
> living organisms) and gender (of words) as English does?
> In that case, shouldn't the French words for "Male" and
> "Female" be shown as "Mâle" and "Femelle", respectively?

The translation "Masculin" and "Féminin" is correct. While the terms "Mâle" and
"Femelle" are indeed used in France to describe the sex of living organisms, they
are not used for human beings and would actually be found quite offensive in
that context. I believe the situation is the same on the other side of the
Atlantic (unless Alain contradicts me :).

Regards,
Thierry.
<><><><><><><><><><><><><><><><><><><><><><>
www.i18ngurus.com - Open Internationalization Resources Directory





Re: Comments on FCD 5218, "Codes for the representation of human sexes"

2001-11-30 Thread Thierry Sourbier

> Otto Stolz  wrote:
> So, the accent circonflex is retained on â, ê, and ô, and
> its use on î and û has been restricted.

Restricted may be a little too strong; literally, it says the circumflex
accent "won't be mandatory except where it is useful : ... " :).

Cheers,
Thierry





Re: French uppercase accented letters (was: Re: Comments on FCD 5218)

2001-11-30 Thread Thierry Sourbier


If you consider that today everything ends up being a question of money, you can
note that the French bank notes (soon to be extinct) use uppercase
accented letters (e.g. "FALSIFIÉ" in the "do-not-copy-me" notice). Yet, as
was previously noted, a great deal of confusion still exists today in
France regarding accents & uppercase.

An illustration can be found in "Le Monde" (one of the most respected French
newspapers): on www.lemonde.fr there are no accents on uppercase letters on the
front page. Yet, within the site some articles make use of uppercase
accented letters (e.g. "DÉBUT JUILLET 1995, ..."). In another French
newspaper (www.liberation.com), the same inconsistency can be seen:
accents are used for the menus (e.g. "MULTIMÉDIA") but not for the headers
(e.g. "Economie").

An explanation could be that computers were used in some occurrences to change
the casing, making it "right". Indeed, typing uppercase accented letters is
a bit trickier than typing their lowercase counterparts (e.g. SHIFT + "é"
gives you a "2"...), which may explain the low usage today. (You can see the
French keyboard layout at
http://www.microsoft.com/globaldev/keyboards/keyboards.asp)
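
To illustrate the point about software getting the casing "right", here is a
minimal sketch in Python 3 (my choice of language, not a tool mentioned in this
thread). It assumes a Unicode-aware runtime; the helper name strip_accents is
hypothetical and only mimics what typists often do by hand.

```python
import unicodedata

def strip_accents(text):
    """Drop combining marks, mimicking hand-typed unaccented capitals."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

headline = "début juillet 1995"
print(headline.upper())                 # accents survive the case change
print(strip_accents(headline.upper()))  # the unaccented variant
```

The case conversion itself keeps the accent; losing it takes an extra,
deliberate step.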

For those who need some statistics, maybe you could survey the web for the
various spellings of the name "États-Unis" (= United States). Both Le Monde and
Libération use "Etats-Unis" despite what the dictionaries say.

I found a bibliography in French on the subject of "uppercase & accents" but
I do not own any of the books mentioned so I could not verify what they say
(http://www.ccdmd.qc.ca/Sitedocu/f0020076.htm).

To comment on a previous remark made in the thread:

> Alain LaBonté wrote:
> it is true that there has always been a usage for unaccented uppercase
> initials of sentences (or proper names), on both sides of the Atlantic
> indeed, and for consistent accentuation, regardless of case.

While I'm not disagreeing with the previous comment, we can note that on
www.larouse.net, accents are used even on the first letter of sentences
(e.g. "À la fois plate-forme de diffusion"). I could not find any
documentation confirming/restricting such a use. I don't even want to think
about how such a usage could be computerized :).

Cheers,
Thierry.








Re: Displaying Negative Unicode on the Web

2001-11-21 Thread Thierry Sourbier

Jen,

Unicode does not define any "negative" values; that may be why you are
having trouble. The "-" sign is probably due to an error (typo? overflow?).

The Simplified Chinese character code point 3 exists, you can look at it
at:
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=7987
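
If the "-" does come from an overflow, one plausible cause (an assumption, as
the original value is not shown intact here) is a 16-bit code unit that was
read as a signed integer. A minimal Python sketch of undoing that follows; the
function names and the sample value are hypothetical.

```python
def signed16_to_codepoint(value):
    """Reinterpret a signed 16-bit integer as an unsigned code point."""
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError("not a signed 16-bit value")
    return value & 0xFFFF

def numeric_reference(value):
    """Build an HTML numeric character reference from the repaired value."""
    return "&#{};".format(signed16_to_codepoint(value))

# Hypothetical input: U+9F99 stored in a signed 16-bit variable.
print(numeric_reference(-24679))  # &#40857;
```

Values below U+8000 are unaffected by the mask, so the same function can be
applied to every code unit before building the reference.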

Cheers,
Thierry.


- Original Message -
From: "jen w" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, November 21, 2001 11:10 AM
Subject: Displaying Negative Unicode on the Web


> Hi,
>
> Can anyone help me with displaying negative unicode values on the web?
>
> I got a value here, notice "&#-3; doest display as chinese character,
> I guess this has to do with the fact that it is a negative value?
>
> Ê&#-3;"|'Ý
>
> Do you guys know any alternative way on how to display a negative value
> such as this one?
>
> Thank you very much,
> Jen





Re: Removal from list

2001-11-20 Thread Thierry Sourbier



Craig,

> It is not at all clear to me how to remove myself from this list.

Have you looked at the instructions on the Unicode site?
http://www.unicode.org/unicode/consortium/distlist.html

You may also send a message to [EMAIL PROTECTED] and write "unsubscribe
unicode" in the subject line.

Maybe this last bit of info could be added to the list information page, as
the form to subscribe/unsubscribe may be tricky to use for people with several
email accounts and/or webmail accounts.
 
Cheers,
Thierry.
 
 

  - Original Message -
  From: Craig Cameron
  To: [EMAIL PROTECTED]
  Sent: Tuesday, November 20, 2001 10:13 AM
  Subject: Removal from list

  It is not at all clear to me how to remove myself from this list.
  Could someone please unsubscribe me from this forum
  Thanks
  Craig Cameron


Re: romajiToKana function

2001-11-16 Thread Thierry Sourbier


> I have before referred to a "romajiToKana" function, not knowing whether
> or not it existed. it seems it does exist.

A couple of Perl and C implementations exist. As I have never used them I cannot
provide further comment, but you'll find them at:

http://www.nic.funet.fi/pub/culture/japan/info/
http://www.srekcah.org/~utashiro/perl/scripts/romkan_pl/
http://packages.debian.org/unstable/interpreters/liblingua-romkan-perl.html

> Is there JavaScript for it?

I doubt it. JavaScript is not the best language in terms of string
manipulation.

Cheers,
Thierry.





Re: ANSI to UTF-8 conversion

2001-10-26 Thread Thierry Sourbier

Marek,

> Please can somebody give me instructions how to transfer the text form
> from the Example 1 to Example 2 ?

Here are some tools that can help you. Read their respective documentation
to get the instructions.

* Uniconv from Basis Technology (http://demos.basistech.com/unicode/)

* C-Kermit: http://www.columbia.edu/kermit/ckermit.html
   At the C-Kermit> prompt, simply type:
   translate  shift-jis utf8 

* Java's bundled native2ascii tool.

* RWS's Rainbow toolbox
(http://www.translate.com/locales/en-US/index.html?init_page=shared/tools/index.html)
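
For a scripted alternative to the tools above, here is a minimal Python
sketch. It assumes that "ANSI" means a Windows code page; cp1250 (Central
European) is only a guess and should be replaced by the code page the file was
really saved in. The function names and file paths are hypothetical.

```python
def ansi_to_utf8(data, source_encoding="cp1250"):
    """Decode bytes from a legacy code page and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

def convert_file(src_path, dst_path, source_encoding="cp1250"):
    """Convert a whole file; the paths are placeholders for your own files."""
    with open(src_path, "rb") as src:
        converted = ansi_to_utf8(src.read(), source_encoding)
    with open(dst_path, "wb") as dst:
        dst.write(converted)

# 0xE8 is "č" in cp1250; in UTF-8 it becomes the two bytes C4 8D.
print(ansi_to_utf8(b"\xe8"))
```

Decoding with the wrong code page usually does not raise an error; it silently
produces the wrong letters, so check a few accented words after converting.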

Regards,
Thierry





Re: GB18030

2001-09-21 Thread Thierry Sourbier

Charlie,

> In what ways will this effect Unicode?
>
> Does it contain anything that Unicode doesn't?

I suggest that you take a look at Markus Scherer's paper "GB 18030: A
mega-codepage":
http://www-106.ibm.com/developerworks/library/u-china.html

It will probably answer your question on the relationship between GB18030
and Unicode.

Cheers,
Thierry.


Re: [OT] o-circumflex

2001-09-06 Thread Thierry Sourbier

> Is it a distinct grapheme, or is it considered a variant of "o"?

I would say it is a variant of "o"; we just call it... "o with a circumflex
accent" ("o avec un accent circonflexe"). The difference between "o" and "ô"
is normally audible (for a French speaker). The relationship is the same
as with any other letter that sometimes takes an accent (e.g. "a" and "à",
"e" and "è", etc.).

The only little thing to know about French and diacritical marks is that when
sorting, diacritical marks are evaluated from right to left (e.g.
"cote" < "côte" < "coté" vs the English order "cote" < "coté" < "côte").

I'm just talking as a French Francophone, not a linguist. Maybe someone on
this list knows why diacritical marks are sorted in French in such a funky
way :).
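
The right-to-left rule can be sketched in a few lines of Python. This is a
simplified illustration, not the full Unicode Collation Algorithm: it compares
base letters first and then the accents, and the accent weights (plain code
points of the combining marks) are an arbitrary assumption. The function names
are hypothetical.

```python
import unicodedata

def split_accents(word):
    """Return (base letters, one tuple of combining marks per letter)."""
    bases, marks = [], []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch) and marks:
            marks[-1] += (ord(ch),)
        elif not unicodedata.combining(ch):
            bases.append(ch)
            marks.append(())
    return "".join(bases), marks

def sort_key(word, french=True):
    """Compare base letters first, then accents (right to left if french)."""
    base, marks = split_accents(word)
    if french:
        marks.reverse()
    return base, tuple(marks)

words = ["côte", "coté", "cote"]
print(sorted(words, key=sort_key))                             # French order
print(sorted(words, key=lambda w: sort_key(w, french=False)))  # English order
```

With french=True the last accent in the word decides first, which is why
"côte" sorts before "coté"; with french=False the first accent decides.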

Cheers,
Thierry

- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, September 06, 2001 3:08 PM
Subject: [OT] o-circumflex



How do Francophones view the o-circumflex "ô" in relation to the letter "o"?
Is it a distinct grapheme, or is it considered a variant of "o"?


- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>





Re: Opentype support under Linux

2001-08-21 Thread Thierry Sourbier


> > Mark Leisher had a page up detailing a set of BDF hints that would
> > allow a smart processor (i.e. Pango) to do appropriate glyph
> > substition.
>
> Pointer, please.

Mark Leisher's site is at:
http://crl.nmsu.edu/~mleisher/

What David was referring to may have been Mark's paper on "Context analysis
for rendering with Unicode", which is available at
http://crl.nmsu.edu/~mleisher/download.html (look at the bottom of the
page).

Cheers,
Thierry.









Re: Launch of www.i18ngurus.com

2001-07-25 Thread Thierry Sourbier

> Well, the most prominent is probably that for an internationalization
> site, you'd best have a couple of international versions (French,
> German, possibly a couple of others) of the site itself :-)

I thought about this paradox too, and given infinite time & resources I would
do it :). Since I have nothing to sell and enough work with the English
version, I decided that www.i18ngourous.fr will wait a bit.

The paradox actually seems widespread in the industry as, to my knowledge,
none of the main i18n books have been translated into other languages (CJKV,
the Unicode Standard, Developing International Software, etc.). The audience
is, I bet, already very small as it is.

Cheers,
T.

PS: Thanks Philipp for your site suggestion!






Launch of www.i18ngurus.com

2001-07-25 Thread Thierry Sourbier

Hi all,

I recently released www.i18ngurus.com, a non-profit open i18n resources
directory. The site should help developers and project managers involved in
i18n projects find useful information on the web (incl. Unicode resources
of course).

The directory currently contains over 490 entries and 100 categories but
this is only the tip of the iceberg. Do not hesitate to suggest/add links to
make the directory a little more complete.

Suggestions to improve the site are welcome.

Regards,
Thierry Sourbier
I18ngurus.com editor.