Re: sorting order between win98/xp

2003-03-12 Thread Michael (michka) Kaplan
From: "Doug Ewell" <[EMAIL PROTECTED]> > Note that I'm speaking in terms of programmable sorting. I really don't > care how filenames in Windows Explorer are sorted. Sigh. I wonder if my mail made it to the list? Doug, Frank was wrong (or rather his colleague was wrong). CompareString does

Re: sorting order between win98/xp

2003-03-12 Thread Doug Ewell
Dominikus Scherkl wrote: >> It is not deterministic string ordering > ?!? > What's non-deterministic in numeric ordering? > Ok, mix of (letter-)strings and numbers maybe not so > straight-forward to sort than simply sorting digits > by their encoding-value (this is the cause it was > not implemen

farsi calendar components

2003-03-12 Thread Paul Hastings
Does anybody know of any Java Farsi calendar components? Thanks. Paul Hastings [EMAIL PROTECTED] CTO Sustainable Development Research Institute Member Team Macromedia (ColdFusion)

Re: What are provisional properties

2003-03-12 Thread Asmus Freytag
At 11:55 AM 3/13/03 +0900, you wrote: Dear Unicoders, The Unicode beta page mentions that a new concept of "provisional properties" has been introduced to 4.0. Unfortunately, no text is available that elaborates this. Is there any way to learn more about that prior to publication of TUS 4.0?

What are provisional properties

2003-03-12 Thread Christian Wittern
Dear Unicoders, The Unicode beta page mentions that a new concept of "provisional properties" has been introduced to 4.0. Unfortunately, no text is available that elaborates this. Is there any way to learn more about that prior to publication of TUS 4.0? All the best, Christian Wittern --

Re: Allocation of Georgian Extended block

2003-03-12 Thread John Cowan
Kenneth Whistler scripsit: > The reason is that the Myanmar block was given four empty columns > because we already *know* of numerous characters that will need > to be added to the Myanmar script to support Shan, Karen, Mon, > and other minority languages written with the script. And which, unl

Re: Allocation of Georgian Extended block

2003-03-12 Thread Kenneth Whistler
The reason is that the Myanmar block was given four empty columns because we already *know* of numerous characters that will need to be added to the Myanmar script to support Shan, Karen, Mon, and other minority languages written with the script. Ending the Myanmar block at U+109F (instead of U+105

Re: Ligatures fj etc (from Re: Ligatures (qj) )

2003-03-12 Thread John Hudson
William Overington wrote: > I have added a new code recently, which is U+E700 STAFF which is a vertical > line from the very top of the glyph and going as far below the 0 line as one > chooses for a particular font. With Quest text I encoded this character > early with a line going vertically fr

Allocation of Georgian Extended block

2003-03-12 Thread Laurentiu Iancu
I noticed that a new Georgian Extended block was tentatively allocated at 2D00. The Myanmar block ends with four empty columns, just before the Georgian block, and the space could conceivably be used to extend the Georgian block while keeping it contiguous. I am sure there are good reasons for th

Re: Ligatures fj etc (from Re: Ligatures (qj) )

2003-03-12 Thread jameskass
William Overington wrote, > I have added a new code recently, which is U+E700 STAFF which is a vertical > line from the very top of the glyph and going as far below the 0 line as one > chooses for a particular font. With Quest text I encoded this character > early with a line going vertically f

Re: Unicode character transformation through XSLT

2003-03-12 Thread Markus Scherer
Generally, try instantiating an InputStreamReader or similar from your input, with an explicit encoding="UTF8". That will perform the conversion from UTF-8 to the internal 16-bit Unicode that Java processes. Always use XYZReader classes for text input and XYZWriter classes for text output. java
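The advice above can be sketched as follows. This is a minimal illustration, not code from the thread: the class name Utf8ReadSketch, the method readUtf8, and the sample bytes are hypothetical, and it uses StandardCharsets.UTF_8 (available in later Java versions) where the message names the encoding as the string "UTF8".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8ReadSketch {
    // Decode a byte stream as UTF-8 into Java's internal 16-bit (UTF-16) form.
    // The Reader does the multi-byte decoding; no manual byte-to-char casts.
    static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c); // c is already a UTF-16 code unit here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample input (an assumption): "E8C", space, EN DASH (U+2013,
        // encoded in UTF-8 as E2 80 93), space, "6".
        byte[] utf8 = {'E', '8', 'C', ' ', (byte) 0xE2, (byte) 0x80, (byte) 0x93, ' ', '6'};
        String s = readUtf8(new ByteArrayInputStream(utf8));
        System.out.println(s.length());                        // 7
        System.out.println(Integer.toHexString(s.charAt(4)));  // 2013
    }
}
```

The three bytes of the dash come out as a single char because the decoding happens inside the Reader, which is the point of preferring Reader/Writer classes over raw streams for text.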

RE: Unicode character transformation through XSLT

2003-03-12 Thread Jain, Pankaj (MED, TCS)
Hi Pim, Thanks for the reply. I modified my program as per your suggestion (changed to byChunk&127), but this time I am getting strange numbers. Here is the value in the database: E8C ? 6 to 10, and the value that I am getting in the property file is: value=69566732980193254321161113249483277721223277 But I n

Re: Unicode library that provides versioned Unicode API?

2003-03-12 Thread Markus Scherer
ICU4C 2.6 (June/July) will support Unicode 4 but also provide an option for Unicode 3.2 normalization (with NormalizationCorrections.txt applied though). http://oss.software.ibm.com/icu/ http://oss.software.ibm.com/pipermail/icu/2003-March/005406.html We do not have any plans so far to do this fo

Re: Unicode character transformation through XSLT

2003-03-12 Thread John Cowan
Pim Blokland scripsit: > As I understand it, char is a signed 16 bits type in Java; any of > the others may be unsigned. Hence the problem. Char is *unsigned*, all the others are always signed. -- "May the hair on your toes never fall out!" John Cowan --Thorin Oakenshield (to Bilbo
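The signedness point can be seen in a few lines. This is only a sketch: the class name SignednessSketch and its two helper methods are hypothetical, and the masked variant merely maps a byte to the code point 0..255 (it is not a UTF-8 decoder).

```java
public class SignednessSketch {
    // Naive cast: the signed byte is first sign-extended to int,
    // then narrowed to the unsigned 16-bit char.
    static char naive(byte b) {
        return (char) b;
    }

    // Masked cast: keep only the low 8 bits, giving a value in 0..255.
    static char masked(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xE2; // first byte of the UTF-8 sequence for U+2013
        System.out.println((int) b);                         // -30
        System.out.println(Integer.toHexString(naive(b)));   // ffe2
        System.out.println(Integer.toHexString(masked(b)));  // e2
    }
}
```

Because byte is signed and char is not, any byte above 0x7F turns into a char in the U+FF80..U+FFFF range when cast directly, which is one way to end up with unexpected characters.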

Re: Unicode character transformation through XSLT

2003-03-12 Thread Pim Blokland
Jain, Pankaj (MED, TCS) schreef: > while((chunk = ipStream.read())!=-1) > { > byte byChunk = new Integer(chunk).byteValue(); > strBuf.append((char) byChunk); > } You don't say which type your "chunk" variable is, but the problem is definitely in the number of conversions you do. In this tiny piec

RE: sorting order between win98/xp

2003-03-12 Thread Dominikus Scherkl
> > > Anyone know why the sort order is different under that two systems? > > As I mentioned: a new feature, keeping numbers ordered numerical. > In my opinion, the user should be allowed to turn this kind of > sorting off. Uuh. Maybe... I always found the "distribution" of options in Windows ver

RE: Unicode character transformation through XSLT

2003-03-12 Thread Jain, Pankaj (MED, TCS)
Hi ftang/james, thanks for the detailed explanation. Now I see the root problem behind my error. I have the following string in the database as Long, in which the special character (?) is equivalent to ndash (-): E8C ? 6 to 10 And I am using the following code to write the string from database to property

Unicode library that provides versioned Unicode API?

2003-03-12 Thread Simon Josefsson
Does anyone know of a Unicode library that provides a versioned API? With "versioned API" I mean a method for the application to request that, e.g., Unicode normalization is performed according to the rules specified in a specific Unicode version instead of simply the latest Unicode version that t

Re: Need encoding conversion routines

2003-03-12 Thread Doug Ewell
askq1 askq1 wrote > I want c/c++ functions/routines that will convert Unicode to > UTF8/UTF16/UCS2 encodings and vice-versa. Can some-one point me where > can I get these code routines? As a quick side note, UCS-2 is simply UTF-16 without surrogate support (i.e. restricted to the BMP), and for t
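As a rough illustration of the side note on UCS-2, the check below tests whether a UTF-16 string stays within the BMP, that is, contains no surrogate code units. The class name Ucs2Sketch and the method fitsInUcs2 are hypothetical, and Java is used here only because Java strings are already UTF-16; the original question asked for C/C++.

```java
public class Ucs2Sketch {
    // True if every UTF-16 code unit is a BMP character on its own,
    // i.e. the string contains no surrogates and is also valid UCS-2.
    static boolean fitsInUcs2(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String bmp = "E8C \u2013 6";                               // EN DASH is in the BMP
        String supplementary = new String(Character.toChars(0x10400)); // needs a surrogate pair
        System.out.println(fitsInUcs2(bmp));             // true
        System.out.println(fitsInUcs2(supplementary));   // false
        System.out.println(supplementary.length());      // 2
    }
}
```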

Re: ZWNJ & Persian Collation

2003-03-12 Thread Markus Scherer
Roozbeh Pournader wrote: Well, anything that is completely ignored in collation creates problems with deterministic sorting. I don't think you mean "deterministic". UCA is deterministic, it just sorts many strings as equal. There are certain words in Persian, with completely different meanings, th

Re: RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread Rick McGowan
> One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/ Complete with a very nice recitation in Latin! http://www.yle.fi/fbc/latini/recitatio.html

RE: sorting order between win98/xp

2003-03-12 Thread Michael Everson
At 13:57 +0100 2003-03-12, Dominikus Scherkl wrote: > One of my colleague ask me this question. We use LCMapStringW > on WinXP and LCMapStringA on Win98 (by using LCMAP_SORTKEY ). And we got > different sorting order for the following Example of message list ordering in Win98: TESTING #1 TES

Re: Need encoding conversion routines

2003-03-12 Thread Rick McGowan
Marco C. suggested: > Unicode's reference implementation is here, but I don't know how much > up-to-date it is with some tiny changes in UTF-8: > http://www.unicode.org/Public/PROGRAMS/CVTUTF/ As far as I know, it is up to date with all latest UTF-8 changes. It may have a bug or two, but

Re: ISO 8859_2 and Windows 1250

2003-03-12 Thread Eric Muller
Otto Stolz wrote: CP 1250 contains the ISO 8859-1 characters, hence it is not suited for Slavic languages. I suspect that Otto meant to type "CP 1252 contains..." Eric.

Re: sorting order between win98/xp

2003-03-12 Thread Michael (michka) Kaplan
From: "Dominikus Scherkl" <[EMAIL PROTECTED]> > Yeah! One of the best features of XP - finally I don't need to > insert leading zeroes to filenames to get them in the proper order > (even 9a is sorted before 10). > > > Anyone know is there a way to make them sort in the same > > order? > Why should

Re: ZWNJ & Persian Collation

2003-03-12 Thread Roozbeh Pournader
On Tue, 11 Mar 2003, Markus Scherer wrote: > The Unicode Collation Algorithm (UCA) for which allkeys.txt is the > default weight table does treat ZWNJ and a number of other characters as > special. For these, they are completely ignored by the UCA - same as if > you stripped them from the text. W

Re: ISO 8859_2 and Windows 1250

2003-03-12 Thread Otto Stolz
SRIDHARAN Aravind wrote: What is the basic difference between ISO 8859_2 and Windows 1250? CP 1250 contains all ISO 8859-2 characters, but some of them in different code positions, plus about two dozen characters, mostly from Unicode's General Punctuation range. Cf.

RE: Need encoding conversion routines

2003-03-12 Thread Marco Cimarosti
askq1 askq1 wrote: > I want c/c++ functions/routines that will convert Unicode to > UTF8/UTF16/UCS2 encodings and vice-versa. Can some-one point > me where can I get these code routines? Unicode's reference implementation is here, but I don't know how much up-to-date it is with some tiny changes

Ligatures fj etc (from Re: Ligatures (qj) )

2003-03-12 Thread William Overington
John Hudson wrote as follows. quote If you don't intend to use the PUA codepoint in text, there really is no point in having it at all. end quote Well, one useful scenario is as follows. Suppose please that one wishes to process incoming regular Unicode text, using a eutocode typography file t

RE: sorting order between win98/xp

2003-03-12 Thread Dominikus Scherkl
> One of my colleague ask me this question. We use LCMapStringW > on WinXP > and LCMapStringA on Win98 (by using LCMAP_SORTKEY ). And we got > different sorting order for the following > > Example of message list ordering in Win98: > TESTING #1 > TESTING #10 > TESTING #100 > TESTING #11 > > W
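The two kinds of ordering discussed in this thread can be compared with a small sketch. It makes no claim about what LCMapString or CompareString actually do on either system: the class NumericSortSketch and its DIGIT_AWARE comparator are hypothetical, and they only contrast plain code-unit ordering (which reproduces the Win98 list quoted above) with an ordering that compares runs of ASCII digits by numeric value.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class NumericSortSketch {
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Hypothetical comparator: runs of ASCII digits compare by numeric value,
    // everything else by UTF-16 code unit. Ties fall back to plain comparison.
    static final Comparator<String> DIGIT_AWARE = (a, b) -> {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            char ca = a.charAt(i), cb = b.charAt(j);
            if (isDigit(ca) && isDigit(cb)) {
                int ie = i, je = j;
                while (ie < a.length() && isDigit(a.charAt(ie))) ie++;
                while (je < b.length() && isDigit(b.charAt(je))) je++;
                int cmp = new BigInteger(a.substring(i, ie))
                        .compareTo(new BigInteger(b.substring(j, je)));
                if (cmp != 0) return cmp;
                i = ie;
                j = je;
            } else {
                if (ca != cb) return ca - cb;
                i++;
                j++;
            }
        }
        int rest = (a.length() - i) - (b.length() - j);
        return rest != 0 ? rest : a.compareTo(b);
    };

    // Return a sorted copy so the input list is left untouched.
    static List<String> sorted(List<String> in, Comparator<String> cmp) {
        List<String> out = new ArrayList<>(in);
        out.sort(cmp);
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "TESTING #100", "TESTING #11", "TESTING #1", "TESTING #10");
        // Plain code-unit order: #1, #10, #100, #11
        System.out.println(sorted(names, Comparator.naturalOrder()));
        // Digit-aware order: #1, #10, #11, #100
        System.out.println(sorted(names, DIGIT_AWARE));
    }
}
```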

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
> One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/ Correction: a weekly review.

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
> The same people consider Latin a dead language, suitable only for > study of ancient documents, which is clearly not the view taken > at the Vatican, which continues to produce new documents in that language. > In recent encyclicals, however, at least as published at www.vatican.va, > the æ an

Need encoding conversion routines

2003-03-12 Thread askq1 askq1
Hi, I want c/c++ functions/routines that will convert Unicode to UTF8/UTF16/UCS2 encodings and vice-versa. Can some-one point me where can I get these code routines? Thanks, ~K.

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread Alan Wood
Christopher John Fynn wrote: > Print e.g. oestrogen (where oe represents a single > sound), but, e.g., chloro-ethane (not chloroethane) to avoid > confusion. Please don't try to apply these rules to chemical nomenclature - there are already enough people who get the hyphens wrong, without encou