Taiwanese: unicode of o with dot right above

2000-08-11 Thread Kiatgak
Taiwanese (or Holooe) is written in a Latin-based script. I cannot find the Unicode code points for the two base characters "O/o with a dot right above", which are pronounced as "OPEN O" (open-mid back rounded vowel). Their appearance can be viewed in section 4.2 of http://www.taiwanese.com/tp/tpsurvey/tpsurvey.pdf. For conform
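For readers arriving at this thread later: the letter asked about here can be written as a base "o" or "O" followed by a combining mark. The sketch below assumes U+0358 COMBINING DOT ABOVE RIGHT is the mark in question; that code point was added to Unicode after this message was posted, and the helper name `with_dot_above_right` is purely illustrative.

```python
import unicodedata

# Assumption: U+0358 COMBINING DOT ABOVE RIGHT is the intended mark.
# It did not exist in Unicode when the question was asked in 2000.
DOT_ABOVE_RIGHT = "\u0358"


def with_dot_above_right(base: str) -> str:
    """Return the base letter followed by the combining dot, normalized to NFC."""
    return unicodedata.normalize("NFC", base + DOT_ABOVE_RIGHT)


lower = with_dot_above_right("o")
upper = with_dot_above_right("O")

# Show each result together with the code points it is made of.
for s in (lower, upper):
    print(s, [f"U+{ord(ch):04X}" for ch in s])
```

There is no precomposed form, so each result stays a two-code-point sequence even after NFC normalization; rendering quality depends on the font.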

First ICU Developer Workshop Meeting, September 2000, Cupertino, CA -- Register now

2000-08-11 Thread Helena Shih
First ICU Developer Workshop Meeting September 11-12, 2000 IBM Emerging Technology Center Cupertino, California, USA ** Unicode is

Re: Tigrinya, Tonga, Turkish, Yoruba (was: RFC 1766)

2000-08-11 Thread Mark Leisher
Peter> I don't agree that there isn't a problem: the idea that's applied Peter> in RFC1766 (at least, in the draft for the next version) is that Peter> (a) precedence is given first to ISO 639-1 then to ISO 639-2, and Peter> (b) you use the most specific tag that is appropriate to

Re: Tigrinya, Tonga, Turkish, Yoruba (was: RFC 1766)

2000-08-11 Thread Peter_Constable
On 08/11/2000 10:48:55 AM Mark Leisher wrote: >>> In the case of Tonga, there may perhaps be a legitimate doubt whether >>> to equate >>> the "to" (Tonga) code with "tog" (Tonga (Nyasa)) or with "ton" (Tonga >>> (Tonga Islands)), or with both. > >Peter> This relates to one of

IANA charset registration for SCSU

2000-08-11 Thread Markus Scherer
Hello, I proposed SCSU (as described in UTR 6) for registration as a charset with IANA (as "SCSU" with no aliases). Good news: it was approved on 2000-jul-19. Bad news: The publication of IANA registrations is currently being redesigned and re-staffed, therefore nothing has been and will be p

Re: Tigrinya, Tonga, Turkish, Yoruba (was: RFC 1766)

2000-08-11 Thread Mark Leisher
>> In the case of Tonga, there may perhaps be a legitimate doubt whether >> to equate >> the "to" (Tonga) code with "tog" (Tonga (Nyasa)) or with "ton" (Tonga >> (Tonga Islands)), or with both. Peter> This relates to one of the problems with ISO 639-x that I'll be Peter>

Re: Tigrinya, Tonga, Turkish, Yoruba (was: RFC 1766)

2000-08-11 Thread Peter_Constable
On 08/11/2000 09:55:57 AM John Cowan wrote: >The problem is in fact more serious. The four languages Tigrinya (ti), Tonga >(to), >Turkish (tr), and Yoruba (yo) do not have their ISO 639-1 codes listed... >I do not have the paper ISO 639-1, I do have the paper version of ISO 639:1998, and it d

Tigrinya, Tonga, Turkish, Yoruba (was: RFC 1766)

2000-08-11 Thread John Cowan
Doug Ewell wrote: > They are not listed in the HTML page on Michael Everson's site, which > claims to be "complete and up-to-date as of 2000-02-19." But they are > listed in a text file, "Technical contents of ISO 639:1988," originally > typed by Keld Simonsen, which I had ignored in favor of th

RE: codepages on Windows

2000-08-11 Thread Peter_Constable
On 08/11/2000 08:50:13 AM Murray Sargent wrote: >The table is pretty easy... The only concern with maintaining a table is knowing whether it's complete and up to date. Unlike, say, LANGIDs, the MSDN library doesn't have a page that lists these constants. If you find those that are mentioned un

Re: codepages on Windows

2000-08-11 Thread Michael (michka) Kaplan
Actually, the Platform SDK docs are really inadequate here. Although the wParam is listed as "Specifies the character set of the new locale" it is impossible to get past the fact that WM_INPUTLANGCHANGEREQUEST talks about INPUTLANGCHANGE_SYSCHARSET as meaning "The new input locale's keyboard layou

Re: RFC 1766

2000-08-11 Thread Peter_Constable
On 08/11/2000 08:13:54 AM Doug Ewell wrote: >They are not listed in the HTML page on Michael Everson's site, which >claims to be "complete and up-to-date as of 2000-02-19." But they are >listed in a text file, "Technical contents of ISO 639:1988," originally >typed by Keld Simonsen, which I had

Re: codepages on Windows

2000-08-11 Thread Peter_Constable
>wParam of WM_INPUTLANGCHANGE *is* the codepage ID (that you can pass >to MultiByteToWideChar(), for example). I believe wParam gives a charset id, not a codepage id. There's a difference. I wasn't sure how one gets codepage from charset (short of maintaining a table). Unfortunately, the MSDN

Re: RFC 1766

2000-08-11 Thread Peter_Constable
>Another problem is that RFCs are not necessarily written with the same >attention to detail, precision, and completeness as ISO or national >standards. Some are written very well indeed, but there are no >guarantees. The present problem with imprecise wording in RFC 1766 is >evidence of this.

Re: RFC 1766

2000-08-11 Thread Doug Ewell
Antoine Leca <[EMAIL PROTECTED]> wrote: >> This means that some major, significant languages like Turkish and >> Yoruba will never get two-letter codes, which seems odd somehow. > > Turkish is tr, and Yoruba is yo, since the beginning. What is the > point? They are not listed in the HTML page o

Re: codepages on Windows

2000-08-11 Thread Jeu George
You can convert UTF-8 characters to Unicode (UTF-16) using MultiByteToWideChar(CP_UTF8, 0, utf8string, -1, unicodestring, size_of_string); the dwFlags argument must be 0 (or MB_ERR_INVALID_CHARS) when the code page is CP_UTF8. The UTF-8 codepage is passed to the function.
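As a portable illustration of what that conversion produces, here is a minimal sketch in Python rather than Win32 C. It does not call MultiByteToWideChar(); it only shows the UTF-8 to UTF-16 mapping itself, and the function name `utf8_to_utf16_units` is made up for this example.

```python
from typing import List


def utf8_to_utf16_units(data: bytes) -> List[int]:
    """Decode UTF-8 bytes and return the equivalent UTF-16 code units."""
    # Strict decoding: malformed UTF-8 raises UnicodeDecodeError.
    text = data.decode("utf-8")
    # Encode as little-endian UTF-16 without a byte order mark.
    raw = text.encode("utf-16-le")
    # Split the byte string into 16-bit code units.
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


units = utf8_to_utf16_units("caf\u00e9".encode("utf-8"))
print([f"0x{u:04X}" for u in units])
```

Characters outside the Basic Multilingual Plane come back as two code units (a surrogate pair), which is worth remembering when sizing the output buffer in the C call.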

Re: codepages on Windows

2000-08-11 Thread Torsten Mohrin
[EMAIL PROTECTED] wrote: >Anybody happen to know: Is there no Win32 API that allows you to determine >a codepage given a LANGID or a charset value (i.e. one of the two >parameters provided by WM_INPUTLANGCHANGE)? wParam of WM_INPUTLANGCHANGE *is* the codepage ID (that you can pass to MultiByteTo

Re: Windows codepages

2000-08-11 Thread Bob Hallissy
>Or, alternately, that takes a charset value and returns a codepage? try TranslateCharsetInfo() Bob
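Where calling TranslateCharsetInfo() is not an option, the hand-maintained table discussed earlier in the thread might look roughly like the sketch below. The charset values and code pages are written from memory of the GDI `*_CHARSET` constants and should be checked against the Windows headers before use; the names `CHARSET_TO_CODEPAGE` and `codepage_for_charset` are hypothetical.

```python
from typing import Optional

# Assumption: GDI charset identifier -> Windows code page, to be verified
# against wingdi.h. Charsets with no fixed code page (DEFAULT_CHARSET,
# SYMBOL_CHARSET, OEM_CHARSET) are deliberately left out.
CHARSET_TO_CODEPAGE = {
    0: 1252,    # ANSI_CHARSET
    128: 932,   # SHIFTJIS_CHARSET
    129: 949,   # HANGUL_CHARSET
    134: 936,   # GB2312_CHARSET
    136: 950,   # CHINESEBIG5_CHARSET
    161: 1253,  # GREEK_CHARSET
    162: 1254,  # TURKISH_CHARSET
    177: 1255,  # HEBREW_CHARSET
    178: 1256,  # ARABIC_CHARSET
    186: 1257,  # BALTIC_CHARSET
    204: 1251,  # RUSSIAN_CHARSET
    222: 874,   # THAI_CHARSET
    238: 1250,  # EASTEUROPE_CHARSET
}


def codepage_for_charset(charset: int) -> Optional[int]:
    """Return the code page for a charset id, or None if it is not in the table."""
    return CHARSET_TO_CODEPAGE.get(charset)


print(codepage_for_charset(128))
```

Returning None for unknown values makes the "is the table complete?" concern raised in the thread visible at the call site instead of silently falling back to a default code page.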