Re: Non-ascii string processing?

2003-10-06 Thread Doug Ewell
Jill Ramonsky wrote: > But then, a default grapheme cluster might theoretically require up to > 16 Unicode characters. (Maybe more, I don't know). Even bit-packed to > 21 bits per character, that still gives us 336 bits. So I conclude > that our string processing functions could go a lot faster i

Macintosh Keyboard Layout Creator

2003-10-06 Thread mjabbar
I am looking for a Keyboard Layout Creator for Roman Unicode characters for Mac OS X. Can anyone suggest one? Thanks and regards, Mustafa Jabbar

Re: Unicode Public Review Issues update: BRAILLE

2003-10-06 Thread Christopher John Fynn
- Original Message - From: "Jony Rosenne" <[EMAIL PROTECTED]> > Please note that Braille is used also for Hebrew. We use the same codes, but > they are assigned a different meaning. The reader has to know or guess which > language it is. > > I don't remember whether Hebrew Braille is writ

Re: Unicode Public Review Issues update

2003-10-06 Thread Rick McGowan
Florian Weimer asked: > > http://www.unicode.org/review/ > > Maybe I'm missing something, but I still can't find any reference that > the Unihan.txt file will be released under a license that permits > redistribution (which has been announced in other documents). Ah, you're right. It will hav

Re: Byzantine musical notation

2003-10-06 Thread Kenneth Whistler
Nick Nicholas asked: > (Resend) > > I can't really find the answer to this question online, especially > because the proposal documents for it don't seem to have been posted to > anubis.dkuug.dk. Furthermore, this is not actually an area I know > anything about. :-) So: > > Byzantine musical

Byzantine musical notation

2003-10-06 Thread Nick Nicholas
(Resend) I can't really find the answer to this question online, especially because the proposal documents for it don't seem to have been posted to anubis.dkuug.dk. Furthermore, this is not actually an area I know anything about. :-) So: Byzantine musical notation consisted of three stages, Mi

RE: Unicode Public Review Issues update

2003-10-06 Thread Jony Rosenne
Please note that Braille is used also for Hebrew. We use the same codes, but they are assigned a different meaning. The reader has to know or guess which language it is. I don't remember whether Hebrew Braille is written RTL or LTR. Jony > -Original Message- > From: [EMAIL PROTECTED] >

Re: Non-ascii string processing?

2003-10-06 Thread Edward H. Trager
On Monday 2003.10.06 21:36:13 +0200, Marco Cimarosti wrote: > Edward H. Trager wrote: > > > But I still don't see any use in knowing how many > > characters are in an UTF-8 > > > string, apart the use that I already mentioned: allocating > > a buffer for a > > > UTF-8 to UTF-32 conversion. > > >

FW: Web Form: Other Question: IPA download?

2003-10-06 Thread Magda Danish (Unicode)
> -Original Message- > Date/Time:Mon Oct 6 13:56:18 EDT 2003 > Contact: [EMAIL PROTECTED] > Report Type: Other Question, Problem, or Feedback > > Dear Madam/Sir, > > I am currently looking for an International Phonetic Alphabet > download/plugin for Flash and do not know wh

RE: Non-ascii string processing?

2003-10-06 Thread Marco Cimarosti
Edward H. Trager wrote: > > But I still don't see any use in knowing how many > characters are in an UTF-8 > > string, apart the use that I already mentioned: allocating > a buffer for a > > UTF-8 to UTF-32 conversion. > > Well, I know a good use for it: a console or terminal-based > applicatio

Re: Unicode Public Review Issues update

2003-10-06 Thread Florian Weimer
Rick McGowan wrote: > The Unicode Technical Committee has posted some new issues for public > review and comment. Details are on the following web page: > > http://www.unicode.org/review/ Maybe I'm missing something, but I still can't find any reference that the Unihan.txt file will be r

Re: Unicode Public Review Issues update

2003-10-06 Thread Asmus Freytag
At 10:29 AM 10/6/03 +0530, [EMAIL PROTECTED] wrote: > The Unicode Technical Committee has posted some new issues for public > review and comment. Details are on the following web page: > > http://www.unicode.org/review/ A question about the issues already open: What is the justification for

RE: Non-ascii string processing?

2003-10-06 Thread Jill Ramonsky
Could you try that again with codepoints > U+ please? I'd be curious to know what happens. Jill > -Original Message- > From: John Delacour [mailto:[EMAIL PROTECTED] > Sent: Monday, October 06, 2003 2:15 PM > To: [EMAIL PROTECTED] > Subject: RE: Non-ascii string processing? > > > At 12

RE: Non-ascii string processing?

2003-10-06 Thread Jill Ramonsky
Nor I. "Characters" are perhaps the most useless objects ever invented. Now - a count of DEFAULT GRAPHEME CLUSTERs might be useful (for example, for display on a console which uses fixed-width fonts). Indeed, a whole class of DEFAULT GRAPHEME CLUSTER handling functions might come in very handy

Re: Non-ascii string processing?

2003-10-06 Thread Edward H. Trager
On Monday 2003.10.06 17:15:25 +0200, Marco Cimarosti wrote: > Stephane Bortzmeyer wrote: > > > OK. But the length in "characters" of a string is not > > "character semantics": > > > it's plain nonsense, IMHO. > > > > I disagree. > > Feel free. > > But I still don't see any use in knowing how ma

RE: Non-ascii string processing?

2003-10-06 Thread jon
> But I still don't see any use in knowing how many characters are in an UTF-8 > string, apart the use that I already mentioned: allocating a buffer for a > UTF-8 to UTF-32 conversion. I wouldn't use it for that at all. I'd assume a worst case of one 32-bit word in the UTF-32 per octet in the UTF-8 o
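A minimal Python sketch of the worst-case sizing described here, assuming the input is well-formed UTF-8. The function name `utf8_to_code_points` is hypothetical and the list stands in for a UTF-32 buffer. Every code point takes at least one octet in UTF-8, so the octet count is a safe upper bound on the number of 32-bit words, and no separate counting pass is needed.

```python
def utf8_to_code_points(utf8: bytes) -> list:
    """Decode UTF-8 into a list of code points (a stand-in for UTF-32)."""
    # Worst case: every octet is a one-byte (ASCII) character, so the
    # output never needs more slots than the input has octets.
    capacity = len(utf8)
    buffer = [0] * capacity

    used = 0
    for ch in utf8.decode("utf-8"):   # assumes well-formed input
        buffer[used] = ord(ch)
        used += 1

    # Trim the unused tail of the over-allocated buffer.
    return buffer[:used]


sample = "a\u00e9\U00010000".encode("utf-8")   # 1 + 2 + 4 = 7 octets
code_points = utf8_to_code_points(sample)      # 3 code points
```

The trade-off is memory for time: the buffer may be up to four times larger than needed, but the string is scanned only once.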

RE: Non-ascii string processing?

2003-10-06 Thread Marco Cimarosti
Stephane Bortzmeyer wrote: > > OK. But the length in "characters" of a string is not > "character semantics": > > it's plain nonsense, IMHO. > > I disagree. Feel free. But I still don't see any use in knowing how many characters are in an UTF-8 string, apart the use that I already mentioned: al

Re: Non-ascii string processing?

2003-10-06 Thread jon
> > a word like "élite" is always counted as five characters, > regardless > > that it might be encoded as six Unicode "characters". > > I assume that everybody on this list knows that you count characters > only after a proper normalization... (like many operations on Unicode > texts). A word li

RE: Non-ascii string processing?

2003-10-06 Thread John Delacour
At 12:09 pm +0200 6/10/03, Marco Cimarosti wrote: What strlen() cannot do is counting the number of *characters* in a string. But who cares? I can imagine very few situations where such information would be useful. #!/usr/bin/perl print "ab, \x{}\x{aaab}" ; printf "\n%s, %s", le

Re: Non-ascii string processing?

2003-10-06 Thread Stephane Bortzmeyer
On Mon, Oct 06, 2003 at 01:52:26PM +0200, Marco Cimarosti <[EMAIL PROTECTED]> wrote a message of 51 lines which said: > a word like "élite" is always counted as five characters, regardless > that it might be encoded as six Unicode "characters". I assume that everybody on this list knows that y
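A short Python sketch of the "élite" example, using the standard `unicodedata` module. The variable names are illustrative only. It shows the same word stored as five or six code points, and the count becoming stable once the string is normalized to NFC.

```python
import unicodedata

composed = "\u00e9lite"      # é as one code point, U+00E9
decomposed = "e\u0301lite"   # e followed by U+0301 COMBINING ACUTE ACCENT

count_composed = len(composed)       # 5 code points
count_decomposed = len(decomposed)   # 6 code points

# After normalization to NFC both spellings compare equal
# and have the same code point count.
nfc = unicodedata.normalize("NFC", decomposed)
count_nfc = len(nfc)                 # 5 code points
```

NFC only helps where a precomposed character exists; for sequences with no precomposed form, a count of code points still differs from what a reader would call characters.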

Re: Non-ascii string processing?

2003-10-06 Thread jon
> > If you really aren't processing anything but the ASCII characters > > within > > your strings, like "<" and ">" in your example, > you can probably get > > away with keeping your existing byte-oriented code. At least you won't > > get false matches on the ASCII characters (this was a primary
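A small Python sketch of why byte-oriented matching on ASCII delimiters is safe in UTF-8. The sample string is made up for illustration. Every octet of a multi-byte UTF-8 sequence has its high bit set, so a search for `<` or `>` can never land inside a non-ASCII character.

```python
text = "<caf\u00e9 \u65e5\u672c\u8a9e>"   # non-ASCII text between ASCII delimiters
data = text.encode("utf-8")

# Plain byte search, as existing byte-oriented code would do.
start = data.find(b"<")
end = data.find(b">")
inner = data[start + 1:end].decode("utf-8")

# Collect the octets that encode the non-ASCII characters and check
# that none of them falls in the ASCII range 0x00..0x7F.
non_ascii_octets = [
    octet
    for ch in text
    if ord(ch) > 0x7F
    for octet in ch.encode("utf-8")
]
no_false_matches = all(octet >= 0x80 for octet in non_ascii_octets)
```

This holds for UTF-8 specifically; some older multi-byte encodings reuse ASCII values as trail bytes, which is where false matches come from.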

Re: Non-ascii string processing?

2003-10-06 Thread Peter Kirk
On 06/10/2003 03:09, Marco Cimarosti wrote: Doug Ewell wrote: Depends on what "processing" you are talking about. Just to cite the most obvious case, passing a non-ASCII, UTF-8 string to byte-oriented strlen() will fail dramatically. Why? The purpose of strlen() is counting the number of

RE: Non-ascii string processing?

2003-10-06 Thread Marco Cimarosti
Stephane Bortzmeyer wrote: > On Mon, Oct 06, 2003 at 12:09:34PM +0200, > Marco Cimarosti <[EMAIL PROTECTED]> wrote > a message of 14 lines which said: > > > What strlen() cannot do is counting the number of > *characters* in a string. > > But who cares? I can imagine very few situations where

Re: Non-ascii string processing?

2003-10-06 Thread Stephane Bortzmeyer
On Mon, Oct 06, 2003 at 12:09:34PM +0200, Marco Cimarosti <[EMAIL PROTECTED]> wrote a message of 14 lines which said: > What strlen() cannot do is counting the number of *characters* in a string. > But who cares? I can imagine very few situations where such > information would be use

RE: Non-ascii string processing?

2003-10-06 Thread Marco Cimarosti
Theodore H. Smith wrote: > Hi lists, Hi, member. > I'm wondering how people tend to do their non-ascii string processing. I think no one has been doing ASCII string processing for decades. :-) But I guess you meant non-SBCS ("single byte character set") string processing. > [...] > So, I'm wond

RE: Non-ascii string processing?

2003-10-06 Thread Marco Cimarosti
Doug Ewell wrote: > Depends on what "processing" you are talking about. Just to cite the > most obvious case, passing a non-ASCII, UTF-8 string to byte-oriented > strlen() will fail dramatically. Why? The purpose of strlen() is counting the number of *bytes* needed to store a certain string, and
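A minimal Python sketch of the distinction being argued here, written in Python rather than C for brevity. `len()` on the encoded bytes plays the role of a byte-oriented strlen(), and the variable names are illustrative. The byte count is the right number for storage, while the code point count is a different quantity that can be derived by skipping UTF-8 continuation octets.

```python
text = "\u00e9lite"              # "élite" with a precomposed é
encoded = text.encode("utf-8")

# What a byte-oriented strlen() reports: octets needed to store the string.
byte_count = len(encoded)

# Number of code points, counted two ways: directly on the decoded string,
# and on the raw bytes by skipping continuation octets (10xxxxxx).
code_point_count = len(text)
lead_octet_count = sum(1 for octet in encoded if (octet & 0xC0) != 0x80)
```

Here `byte_count` is 6 and both code point counts are 5, so strlen() gives the correct answer for buffer sizes and the wrong one only if it is read as a character count.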

Re: Unicode Public Review Issues update

2003-10-06 Thread jon
> The Unicode Technical Committee has posted some new issues for public > review and comment. Details are on the following web page: > > http://www.unicode.org/review/ A question about the issues already open: What is the justification for proposing to make Braille Lo?