Re: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Yves Arrouye
> > Since the U in UTF stands for Unicode, UTF-32 cannot represent more than what Unicode encodes, which is 1+ million code points. Otherwise, you're talking about UCS-4. But I thought that one of the latest revs of ISO 10646 explicitly specified that UCS-4 will never enc

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Thomas Chan
On Fri, 9 Mar 2001, Marco Cimarosti wrote: > It is not very clear to me what is included in Extension B: how is it > possible to know something more about it? Look at DUTR #27[1] (2001.2.23), section 10.1, and see if any of those sources are ones that contain characters that are important to you

RE: Unicode market acceptance

2001-03-09 Thread Peter_Constable
On 03/09/2001 12:30:52 PM "Richard, Francois M" wrote: >I sure would like to look at a comparison of these costs. Are they available >anywhere? In Multilingual Computing #36, there was a book review of the book "Translating Into Success: Cutting-Edge Strategies for Going Multilingual in a Globa

Citrix Metaframe and MFC Apps

2001-03-09 Thread Jones, Bob
Has anyone had success using Citrix metaframe with a Unicode enabled MFC app? Anyone tried to do this and had problems? I'm mostly interested in Version 1.8a, but any feedback would be appreciated. Thanks, Bob

Re: Final letters in Hebrew and Arabic

2001-03-09 Thread Jeff Guevin
Arabic letters, when alone, always appear in form 1 -- the form designed for just such a use. There is one medial form--for letters on both sides--and two other forms, one for when the letter is attached from the right, and one attached from the left. Arabic short vowels (which are like diacriti

Re: Final letters in Hebrew and Arabic

2001-03-09 Thread Roozbeh Pournader
On Fri, 9 Mar 2001, Nick NICHOLAS wrote: > (1) When a letter with a final variant appears alone --- say as a numeral, > or in discussion of the letter or phoneme --- does it under any > circumstances appear in its final form, or is it always medial? > > (2) Do diacritics --- vowel points, can

Final letters in Hebrew and Arabic

2001-03-09 Thread Nick NICHOLAS
Many thanks to all who responded on the Albanian alphabet. No script too obscure, eh? :-) Kudos! With regard to the recent discussion on Greek final sigmas, I have a couple of questions on the final forms of letters in Hebrew and Arabic, just for the sake of comparison. (1) When a letter with a

Re: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Ienup Sung
Well, the C standard library is everywhere these days, including on non-Unix systems, as "Runtime" support. I do know, from experience, that up to the early 80s there was no such thing as mblen on all systems, but things changed quite a bit many years ago. Als

Re: Unicode market acceptance

2001-03-09 Thread Tex Texin
Richard, Looks excellent! I'll use some of those ideas and Michka's cost comments. By the way, homepage.com told me they are quitting the hosting business so my pages will need to move again. I'll let you know where it goes. Richard, you are welcome to host a copy at your site. I'll make a note t

Re: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Antoine Leca
Ienup Sung wrote: > > Well, contrary to what you said, it is a very good option since you don't have to know anything about what's inside the character bytes, which means that by using mblen/mbrlen, you can achieve codeset independent programming that will support not only Unicode/UTF-

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Ayers, Mike
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] > > On 03/09/2001 12:53:57 PM "Ayers, Mike" wrote: > > >Um... no. The UTF-32 CES can handle much more than the current > >space of the Unicode CCS. As far as I can tell, it's good > to go until we > >need more than 32 bits to represent

Re: Unicode market acceptance

2001-03-09 Thread Richard Cook
Tex Texin wrote: > > not the same as work for execs. The success of Unicode is obvious > to us (techies) is not clear to them. Tex, Recently looking at and talking about this http://i18n.homepage.com/UnicodeBenefits.html with some people, initiated and uninitiated, I quickly wrote this: http

Re: Unicode market acceptance

2001-03-09 Thread Michael (michka) Kaplan
Cathy Wissink (from the Windows division at Microsoft) will be discussing the issue at her presentation in Hong Kong, talking about the next version of Windows and Unicode (though I am not sure how much detail she will give in the presentation). There will be more detail in the paper, obviously, b

Re: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Ienup Sung
Well, contrary to what you said, it is a very good option since you don't have to know anything about what's inside the character bytes, which means that by using mblen/mbrlen, you can achieve codeset independent programming that will support not only Unicode/UTF-8 but also any other major co

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Peter_Constable
On 03/09/2001 12:53:57 PM "Ayers, Mike" wrote: >Um... no. The UTF-32 CES can handle much more than the current >space of the Unicode CCS. As far as I can tell, it's good to go until we >need more than 32 bits to represent the ACR. I'm actually surprised that >this comment was so misunders

Re: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Allan Chau
Yves Arrouye wrote: > > > On 03/08/2001 07:40:25 PM "Ayers, Mike" wrote: > > > > > > >If you really want to finish the job, there's always > > > UTF-32, which > > > >should do rather nicely until we meet the space aliens with the > > > >4,293,853,186 character alphabet! > > > > > > Um... no.

Re: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Keld Jørn Simonsen
On Fri, Mar 09, 2001 at 10:56:30AM -0800, Yves Arrouye wrote: > > Since the U in UTF stands for Unicode, UTF-32 cannot represent more than what Unicode encodes, which is 1+ million code points. Otherwise, you're talking about UCS-4. But I thought that one of the latest revs of ISO 10646

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Yves Arrouye
> > On 03/08/2001 07:40:25 PM "Ayers, Mike" wrote: > > > > >If you really want to finish the job, there's always > > UTF-32, which > > >should do rather nicely until we meet the space aliens with the > > >4,293,853,186 character alphabet! > > > > Um... no. The 1,113,023 character alphabet (

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Ayers, Mike
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] > > On 03/08/2001 07:40:25 PM "Ayers, Mike" wrote: > > >If you really want to finish the job, there's always > UTF-32, which > >should do rather nicely until we meet the space aliens with the > >4,293,853,186 character alphabet! > > Um.

RE: Unicode market acceptance

2001-03-09 Thread Richard, Francois M
> -Original Message- > From: Michael (michka) Kaplan [mailto:[EMAIL PROTECTED]] > Sent: Friday, March 09, 2001 12:26 PM > To: Unicode List > Subject: Re: Unicode market acceptance > > > One of the most compelling arguments for those managers is > the financial > one... ease of support

Re: languages on Web (was Re: Unicode market acceptance)

2001-03-09 Thread Tex Texin
Try: http://www.glreach.com/globstats/index.php3 I am sure the coverage is not the extent you want Peter. Also from time to time I see forecasts of when some language will be the primary language of x% of internet users, or when x% of web pages will be in some language. Would be interesting to

Re: Romanche dash

2001-03-09 Thread Otto Stolz
Am 2001-03-06 um 15:28 h UTC hat Patrick Andries geschrieben: > More polysemy for the dash... To which I have remarked: > Particularly, as the Ladin orthography also features a hyphen > (Strich d'uniun), While browsing through Gian Paul Ganzoni "Grammatica Ladina: Grammatica sistematica dal ruma

Re: languages on Web (was Re: Unicode market acceptance)

2001-03-09 Thread James E. Agenbroad
On Fri, 9 Mar 2001 [EMAIL PROTECTED] wrote: > > On 03/09/2001 11:01:53 AM "Tex Texin" wrote: > > >We have estimates for (human) language usages on the web > > Do you mean the number of different languages used on the web? I'd be > curious to know what such estimates are. > > > > - Peter >

languages on Web (was Re: Unicode market acceptance)

2001-03-09 Thread Peter_Constable
On 03/09/2001 11:01:53 AM "Tex Texin" wrote: >We have estimates for (human) language usages on the web Do you mean the number of different languages used on the web? I'd be curious to know what such estimates are. - Peter

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Marco Cimarosti
Thomas Chan wrote: > > Does it exist at least one character > U+ that is > > commonly used in at least one modern language? > > How about music and math notation? About the music symbols in Unicode 3.1, they are just the basic building blocks for it. So I assume that handling surrogates (or

Re: Unicode market acceptance

2001-03-09 Thread Michael (michka) Kaplan
One of the most compelling arguments for those managers is the financial one... ease of support for multiple languages. If you look at the cost of the multiple binary releases of a product like Win95 and compare it to the single EXE model, the issue is clear... and Unicode is the way to achieve th

RE: Globalization question.

2001-03-09 Thread Carl W. Brown
Michael, The great thing about most books is that they give you answers. The hard part of Globalization is asking the right questions. Knowing what questions to ask comes with years of experience. Because globalization must fit into the culture of the software development company if it going t

RE: Problem with MSIE 5.0 for Macintosh ( known bug?)

2001-03-09 Thread Carl W. Brown
Doug, Alan Wood also has a good site on Unicode and browser support. http://www.hclrss.demon.co.uk/unicode/ Carl -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Tuesday, March 06, 2001 7:55 PM To: Unicode List Subject: Re: Problem with MSIE 5.0 for Macintosh

Re: Unicode market acceptance

2001-03-09 Thread Tex Texin
Guys, I know the list of who's who using Unicode. Me too is not a compelling business argument. None of these put Unicode as the sole character set to use, so it's simply another way to go. (OK, I know Java and XML; please don't push back on these. Fundamentally, although they use Unicode I can als

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Thomas Chan
On Fri, 9 Mar 2001, Marco Cimarosti wrote: > Addison P. Phillips wrote: > > [...] > > currently there are no characters "up there" this isn't a really big > > deal. Shortly, when Unicode 3.1 is official, there will be 40K or so > > characters in the supplemental planes... but they'll be > > rela

Re: Script of Elbasan and other Albanian Alphabets

2001-03-09 Thread Michael Everson
At 01:06 -0800 2001-03-09, JÖRG KNAPPEN wrote: >Having consulted my references: > >The discussion of the Albanian alphabets is in Jenssen, p 494 ff. >Haarmann has nothing about them, not even the pictures. See also Faulmann pp. 181-82. -- Michael Everson ** Everson Gunn Teoranta ** http://w

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Peter_Constable
On 03/08/2001 07:40:25 PM "Ayers, Mike" wrote: >If you really want to finish the job, there's always UTF-32, which >should do rather nicely until we meet the space aliens with the >4,293,853,186 character alphabet! Um... no. The 1,113,023 character alphabet (one more than the encodable scal
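
For orientation, the sizes being argued over in this thread can be checked with a few lines of arithmetic. The Python sketch below is not from any of the posters; the constant names are illustrative, and it assumes the code space as Unicode defines it today (17 planes of 65,536 code points, with U+D800..U+DFFF set aside for surrogates). The snippet above is cut off, so the sketch does not try to reproduce how the figures quoted in the posts were derived.

```python
# Illustrative constants; the names are assumptions, not taken from the thread.
PLANES = 17                        # plane 0 (BMP) plus 16 supplementary planes
CODE_POINTS_PER_PLANE = 0x10000    # 65,536 code points per plane
SURROGATES = 0xDFFF - 0xD800 + 1   # 2,048 code points reserved for UTF-16 pairs

# Total code points in the range U+0000..U+10FFFF.
code_points = PLANES * CODE_POINTS_PER_PLANE

# Scalar values: code points that are not surrogates.
scalar_values = code_points - SURROGATES

# Number of distinct values a 32-bit code unit can hold.
utf32_unit_values = 2 ** 32

print(f"code points:            {code_points:,}")
print(f"scalar values:          {scalar_values:,}")
print(f"unused 32-bit patterns: {utf32_unit_values - code_points:,}")
```

Under these assumptions a 32-bit unit has far more bit patterns than Unicode has code points, which seems to be the gap the two posters are describing from opposite sides.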

19th century "Unicode"

2001-03-09 Thread Jeff Guevin
A reference for all kinds of cryptographic info, including telegraphic code, is at http://www.codasaurus.com/. I haven't delved into the site much, but it looks promising. jeff

Re: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Antoine Leca
Ienup Sung wrote: > > I also implement UTF-16 and UTF-8 support in various levels and > I find UTF-8 is easier to handle and write software with since > we have many MB functions, e.g., mblen() for byte length, that we can use, > and, there is no byte ordering hassle that we need to worry ab
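
The byte-ordering point in the quoted text can be made concrete with a small sketch. This is an illustration in Python rather than the C multibyte functions the poster mentions, and the sample string is an arbitrary choice; it only shows that the same text has two possible UTF-16 byte serializations but a single UTF-8 one.

```python
# Arbitrary sample: U+0041 (A) followed by U+00E9 (e with acute accent).
text = "A\u00e9"

# UTF-16 uses 16-bit code units, so the byte order must be chosen or signaled.
utf16_be = text.encode("utf-16-be")
utf16_le = text.encode("utf-16-le")

# UTF-8 is defined as a byte sequence, so there is only one serialization.
utf8 = text.encode("utf-8")

print(utf16_be.hex())  # 004100e9
print(utf16_le.hex())  # 4100e900
print(utf8.hex())      # 41c3a9
```

A reader of UTF-16 data has to know or detect which of the first two forms it was given; a reader of UTF-8 data does not.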

RE: Unicode market acceptance

2001-03-09 Thread Marco Cimarosti
Pierpaolo BERNARDI wrote: > "It's what Microsoft uses" should work, methinks. Tex Texin wrote: > Not really. For one, many companies use platforms other than Windows. Then add "And it's also used in *Java*, *HTML*, *SQL Server*, and *Oracle*. Oh, by the way, and *IBM* has a *free* library to sup

RE: UTF8 vs. Unicode (UTF16) in code

2001-03-09 Thread Marco Cimarosti
Addison P. Phillips wrote: > [...] > currently there are no characters "up there" this isn't a really big > deal. Shortly, when Unicode 3.1 is official, there will be 40K or so > characters in the supplemental planes... but they'll be > relatively rare. This reminds me of a question that I wante
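
As a hedged illustration of what characters "up there" mean for code, the Python sketch below encodes one supplementary-plane character in the three encoding forms discussed in this thread. The choice of U+20000 (the first code point of CJK Extension B) and the variable names are assumptions made for the example; none of this is code from the posters.

```python
# U+20000 lies outside the Basic Multilingual Plane.
ch = "\U00020000"

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")
utf32 = ch.encode("utf-32-be")

# Derive the UTF-16 surrogate pair by hand from the code point.
v = ord(ch) - 0x10000
high_calc = 0xD800 + (v >> 10)    # high (leading) surrogate
low_calc = 0xDC00 + (v & 0x3FF)   # low (trailing) surrogate

# Read the pair back out of the encoded bytes for comparison.
high = int.from_bytes(utf16[:2], "big")
low = int.from_bytes(utf16[2:], "big")

print(len(utf8), "UTF-8 bytes")             # 4
print(len(utf16) // 2, "UTF-16 code units")  # 2
print(len(utf32) // 4, "UTF-32 code unit")   # 1
print(hex(high), hex(low))                   # 0xd840 0xdc00
```

Code that assumes one 16-bit unit per character works only as long as such characters never show up, which is the trade-off this entry is weighing.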

Script of Elbasan and other Albanian Alphabets

2001-03-09 Thread JÖRG KNAPPEN
Having consulted my references: The discussion of the Albanian alphabets is in Jenssen, p 494 ff. Haarmann has nothing about them, not even the pictures. --J"org Knappen