Re: Question about Karabakh Characters

2017-10-05 Thread Michael Everson via Unicode
> So I decided to make a post. > > Kazunari Tsuboi > > -Original Message- > From: Michael Everson [mailto:ever...@evertype.com] > Sent: Wednesday, October 4, 2017 11:31 PM > To: Tsuboi, Kazunari > Cc: unicode Unicode Discussion > Subject: Re: Question about Karaba

RE: Question about Karabakh Characters

2017-10-04 Thread via Unicode
, Kazunari Cc: unicode Unicode Discussion Subject: Re: Question about Karabakh Characters They are not encoded, but that example is not sufficient. If you’d like to contact me offline we can discuss this further. Michael Everson > On 4 Oct 2017, at 08:39, via Unicode wrote: > > Hi there,

Re: Question about Karabakh Characters

2017-10-04 Thread Michael Everson via Unicode
They are not encoded, but that example is not sufficient. If you’d like to contact me offline we can discuss this further. Michael Everson > On 4 Oct 2017, at 08:39, via Unicode wrote: > > Hi there, > > The Karabakh language uses Armenian characters, but the following characters > do not ha

Re: Question about Perl5 extended UTF-8 design

2015-11-06 Thread Karl Williamson
On 11/06/2015 01:32 PM, Richard Wordingham wrote: On Thu, 05 Nov 2015 13:41:42 -0700 "Doug Ewell" wrote: Richard Wordingham wrote: No-one's claiming it is for a Unicode Transformation Format (UTF). Then they ought not to call it "UTF-8" or "extended" or "modified" UTF-8, or anything of the

Re: Question about Perl5 extended UTF-8 design

2015-11-06 Thread Richard Wordingham
On Thu, 05 Nov 2015 13:41:42 -0700 "Doug Ewell" wrote: > Richard Wordingham wrote: > > > No-one's claiming it is for a Unicode Transformation Format (UTF). > > Then they ought not to call it "UTF-8" or "extended" or "modified" > UTF-8, or anything of the sort, even if the bit-shifting algorithm

Re: Question about Perl5 extended UTF-8 design

2015-11-06 Thread Otto Stolz
Am 05.11.2015 um 23:11 schrieb Ilya Zakharevich: First of all, “reserved” means that they have no meaning. Right? Almost. “Reserved” means that they have currently no meaning but may be assigned a meaning, later; hence you ought not use them lest your programs, or data, be invalidated by late

Re: Question about Perl5 extended UTF-8 design

2015-11-05 Thread Philippe Verdy
2015-11-05 23:11 GMT+01:00 Ilya Zakharevich wrote > > • 128-bit architectures may be at hand (sooner or later). This is speculation for something that is still not envisioned: a global worldwide working space where users and applications would interoperate transparently in a giant virtualized

Re: Question about Perl5 extended UTF-8 design

2015-11-05 Thread Ilya Zakharevich
On Thu, Nov 05, 2015 at 08:57:16AM -0700, Karl Williamson wrote: > Several of us are wondering about the reason for reserving bits for > the extended UTF-8 in perl5. I'm asking you because you are the > apparent author of the commits that did this. To start, the INTERNAL REPRESENTATION of Perl’s

Re: Question about Perl5 extended UTF-8 design

2015-11-05 Thread Doug Ewell
Richard Wordingham wrote: > No-one's claiming it is for a Unicode Transformation Format (UTF). Then they ought not to call it "UTF-8" or "extended" or "modified" UTF-8, or anything of the sort, even if the bit-shifting algorithm is based on UTF-8. "UTF-8 encoding form" is defined as a mapping of

Re: Question about Perl5 extended UTF-8 design

2015-11-05 Thread Richard Wordingham
On Thu, 5 Nov 2015 18:25:05 +0100 Philippe Verdy wrote: > But these extra code points could be used to represent something else > such as unique object identifier for internal use in your > application, or virtual object pointers, or shared memory block > handles, file/pipe/stream I/O handles,

Re: Question about Perl5 extended UTF-8 design

2015-11-05 Thread Markus Scherer
On Thu, Nov 5, 2015 at 9:25 AM, Philippe Verdy wrote: > (0xFF was reserved only in the old RFC version of UTF-8 when it allowed > code points up to 31 bits, but even this RFC is obsolete and should no > longer be used and it has never been approved by Unicode). > No, even in the original UTF-8 d

Re: Question about Perl5 extended UTF-8 design

2015-11-05 Thread Philippe Verdy
It won't represent any valid Unicode codepoint (no standard scalar value defined), so if you use those leading bytes, don't pretend it is for "UTF-8" (not even "modified UTF-8" which is the variant created in Java for its internal serialization of unrestricted 16-bit strings, including for lone sur

Re: Question about the Sentence_Break property

2015-02-21 Thread Karl Williamson
On 02/20/2015 04:56 PM, Philippe Verdy wrote: 2015-02-20 6:14 GMT+01:00 Richard Wordingham mailto:richard.wording...@ntlworld.com>>: TUS has a whole section on the issue, namely TUS 7.0.0 Section 5.8. One thing that is missing is mention of the convention that a single newline charac

Re: Question about the Sentence_Break property

2015-02-20 Thread Konstantin Ritt
When UAX9 mentions a paragraph level, it says: > Paragraphs are divided by the Paragraph Separator or appropriate Newline Function (for guidelines on the handling of CR, LF, and CRLF, see *Section 4.4, Directionality*, and *Section 5.8, Newline Guidelines* of [Unicode

Re: Question about the Sentence_Break property

2015-02-20 Thread Philippe Verdy
2015-02-20 6:14 GMT+01:00 Richard Wordingham < richard.wording...@ntlworld.com>: > TUS has a whole section on the issue, namely TUS 7.0.0 Section 5.8. > One thing that is missing is mention of the convention that a single > newline character (or CRLF pair) is a line break whereas a doubled > newli

Re: Question about the Sentence_Break property

2015-02-19 Thread Richard Wordingham
On Thu, 19 Feb 2015 19:55:20 -0700 Karl Williamson wrote: > UAX 29 says this: > > Break after paragraph separators. > SB4. Sep | CR | LF > > Why are CR and LF considered to be paragraph separators? NEL and > Line Break are as well. > > My mental model of plain text has it containing embed

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Steffen Nurpmeso
Philippe Verdy wrote: |The standard C++ "string" package could have then used this standard |internally in the methods exposed in its API. I cannot understand this |simple effort was never done on such basic functionality needed and used in |almost all softwares and OSes. There are plenty of

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Steffen Nurpmeso
Philippe Verdy wrote: |Successors to convert strings instead of just isolated "characters" (sorry, |they are NOT what we need to handle "texts", they are not even equivalent |to Unicode characters, they are just code units, most often 8-bit with |"char" or 16-bit only with "wchar_t" !) already

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Philippe Verdy
The equivalent of strtolower() and strtoupper() is implemented in all C libraries I know (yes, including glibc) and I have worked with on various OSes (and since very long!), even if their names change (because of the unfortunate lack of standardization about their interaction with C locales). The

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Philippe Verdy
Successors to convert strings instead of just isolated "characters" (sorry, they are NOT what we need to handle "texts", they are not even equivalent to Unicode characters, they are just code units, most often 8-bit with "char" or 16-bit only with "wchar_t" !) already exist in all C libraries (incl

Re: Question about "Uppercase" in DerivedCoreProperties.txt

2014-11-10 Thread Doug Ewell
Philippe Verdy wrote: > glibc is not more broken than any other C library implementing toupper > and tolower from the legacy "ctype" standard library. These are old > APIs that are just widely used and still have valid contexts where they > are simple and safe to use. But they are not meant to conv

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-10 Thread Steffen Nurpmeso
Philippe Verdy wrote: |glibc is not more broken than any other C library implementing toupper and |tolower from the legacy "ctype" standard library. These are old APIs that |are just widely used and still have valid contexts where they are simple and |safe to use. But they are not meant to conve

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-08 Thread Philippe Verdy
glibc is not more broken than any other C library implementing toupper and tolower from the legacy "ctype" standard library. These are old APIs that are just widely used and still have valid contexts where they are simple and safe to use. But they are not meant to convert text. The i18n data just sh

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-08 Thread Christopher Vance
So glibc is broken. This doesn't make it a Unicode problem. On Sat, Nov 8, 2014 at 8:22 PM, Mike FABIAN wrote: > Philippe Verdy さんはかきました: > > > note that tolower() and toupper() can only work at the 1-character level, it > > is not recommended for use for changing case of plain text. > > > > For c

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-08 Thread Philippe Verdy
Do not try to get consistent results with only a character-to-character mapping; it does not work with all letters, because sometimes you need 1->2 or 2->1 mappings (not all composable characters exist in precombined forms, or sometimes the combination must be split into its canonical decomposed equ
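
As a small illustration of the 1->2 case mentioned in this message, here is a minimal Python sketch. Python is an assumption of this note and is not used anywhere in the thread, and the helper name `changes_length` is hypothetical. The sketch relies on `str.upper()`, which applies Unicode full case mappings, so the result can contain more code points than the input.

```python
def changes_length(text: str) -> bool:
    """Return True when uppercasing alters the number of code points."""
    return len(text.upper()) != len(text)


# U+00DF LATIN SMALL LETTER SHARP S uppercases to the two letters "SS",
# so a function that maps one character to exactly one character
# (like C's toupper) cannot produce this result.
print("ß".upper())                # SS
print(changes_length("Straße"))   # True
print(changes_length("Strasse"))  # False
```

A string-level conversion can return a longer string; a per-character function has no way to do so, which is the limitation the message describes.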

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-08 Thread Mike FABIAN
Philippe Verdy さんはかきました: > note that tolower() and toupper() can only work at the 1-character level, it > is not recommended for use for changing case of plain text. > > For correct handling of locales, tolower and toupper should be replaced by > strtolower and strtoupper (or their aliases) which w

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-07 Thread Mike FABIAN
Philippe Verdy さんはかきました: > this is a "feature" of the Greek alphabet that the lowercase iota subscript > can be capitalized in two different ways : either as a subscript below the > uppercase main letter, or as a standard iota capitalized. The subscript > form is a combining character, but not th

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-07 Thread Philippe Verdy
Note that tolower() and toupper() can only work at the 1-character level; they are not recommended for changing the case of plain text. Their purpose should be limited to use cases where letters can be safely isolated from their context, for example when handling letters as numbers (e.g. section number

RE: Question about "Uppercase" in DerivedCoreProperties.txt

2014-11-06 Thread Laurentiu Iancu
Hello, The property Uppercase is a binary, informative property derived from General_Category (gc=Lu) and Other_Uppercase (OUpper=Y), as documented in Section 5.3 of UAX #44, at http://www.unicode.org/reports/tr44/#Uppercase. All of the characters you enumerated are titlecase letters (gc=Lt
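
The distinction between gc=Lu and gc=Lt can be checked with a short Python sketch. Python and the sample characters are assumptions of this note; the snippet above does not show which characters the original question listed. The standard `unicodedata` module exposes General_Category but not Other_Uppercase, so only the first half of the derivation is shown, and the wrapper name `general_category` is hypothetical.

```python
import unicodedata


def general_category(ch: str) -> str:
    """Return the General_Category value (e.g. 'Lu', 'Lt') of one character."""
    return unicodedata.category(ch)


# Illustrative samples chosen for this note:
#   U+01C4 is an uppercase letter (gc=Lu),
#   U+01C5 and U+1F88 are titlecase letters (gc=Lt).
for ch in ("\u01C4", "\u01C5", "\u1F88"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: gc={general_category(ch)}")
```

Because titlecase letters have gc=Lt rather than gc=Lu, they fall outside the gc=Lu part of the derivation quoted in the message.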

Re: Question about “Uppercase” in DerivedCoreProperties.txt

2014-11-06 Thread Philippe Verdy
This is a "feature" of the Greek alphabet: the lowercase iota subscript can be capitalized in two different ways, either as a subscript below the uppercase main letter, or as a standard iota capitalized. The subscript form is a combining character, but not the non-subscript form. There should

Re: Question about a Normalization test

2014-10-23 Thread Aaron Cannon
On 10/23/14, Whistler, Ken wrote: > Test cases like this are included in NormalizationTest.txt precisely > to ensure that implementations are correctly detecting these > sequences where composition is blocked. And I am indeed glad that they are, as I completely missed this small but critical deta

RE: Question about a Normalization test

2014-10-23 Thread Whistler, Ken
Aaron Cannon asked: > Hi all, from the latest version of the standard, on line 16977 of the > normalization tests, I am a bit confused by the NFC form. It appears > incorrect to me. Here's the line, sans comment: > > 0061 0305 0315 0300 05AE 0062;0061 05AE 0305 0300 0315 0062;0061 05AE >

Re: Question about a Normalization test

2014-10-23 Thread Mark Davis ☕️
On Thu, Oct 23, 2014 at 6:54 PM, Aaron Cannon < cann...@fireantproductions.com> wrote: > 0061 05AE 0305 0300 0315 0062 http://unicode.org/cldr/utility/list-unicodeset.jsp?a=%5Cu0061+%5Cu05AE+%5Cu0305+%5Cu0300+%5Cu0315+%5Cu0062&g=ccc ​0305 and 0300 have the same ccc, so the first one blocks the

Re: Question about WordBreak property rules

2014-07-24 Thread Karl Williamson
On 07/24/2014 01:38 PM, Karl Williamson wrote: http://www.unicode.org/draft/reports/tr29/tr29.html#WB6 indicates that there should be no break between the first two letters in the sequence Hebrew_Letter Single_Quote Hebrew_Letter. However, rule 7a just below indicates that there should be no bre

Re: question to Akkadian

2014-05-19 Thread Werner LEMBERG
>> I'm interested in representing one of the so-called Hurrian songs >> (tablet H.6, containing musical notation) with Unicode, cf. >> >> https://en.wikipedia.org/wiki/Hurrian_songs > > That says it represents qáb, which seems to be a version of Labat > 88, which is U+1218F KAB. > > Unfortunat

Re: question to Akkadian

2014-05-19 Thread Tom Gewecke
On May 19, 2014, at 9:21 AM, Werner LEMBERG wrote: > > I'm interested in representing one of the so-called Hurrian songs > (tablet H.6, containing musical notation) with Unicode, cf. > > https://en.wikipedia.org/wiki/Hurrian_songs That says it represents qáb, which seems to be a version of La

Re: question to Akkadian

2014-05-19 Thread Werner LEMBERG
>> If I have a cuneiform text, where can I find glyph images to >> identify them? > > You might want to specify what you mean by "text". A photo of an > inscription? Something from a printed book? I'm interested in representing one of the so-called Hurrian songs (tablet H.6, containing musica

Re: question to Akkadian

2014-05-19 Thread Tom Gewecke
On May 19, 2014, at 8:40 AM, Werner LEMBERG wrote: > If I have a cuneiform > text, where can I find glyph images to identify them? You might want to specify what you mean by "text". A photo of an inscription? Something from a printed book? Because of the considerable variation in glyphs ove

Fwd: Re: Question about normalization tests

2012-12-10 Thread Edwin Hoogerbeets
Ah yes, I did indeed miss the "equal to" part. I fixed up my code and now it works as expected. Thanks to Mark and Ken for your help and speedy response! Edwin On 12/10/2012 12:57 PM, Whistler, Ken wrote: > > Your misunderstanding is at the highlighted statement below. Actually > 0300 **is** blo

RE: Question about normalization tests

2012-12-10 Thread Whistler, Ken
Your misunderstanding is at the highlighted statement below. Actually 0300 *is* blocked from 0061 in this sequence, because it is preceded by a character with the same canonical combining class (i.e. U+0305, ccc=230). A blocking context is the preceding combining character either having ccc=0 or
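
The blocking rule described here can be observed with a minimal Python sketch. Python's `unicodedata` module is an assumption of this note and is not mentioned in the thread, and the helper name `nfc_hex` is hypothetical. The sketch compares U+0061 U+0300, which composes, with U+0061 U+0305 U+0300, where U+0305 has the same combining class as U+0300 and therefore blocks it.

```python
import unicodedata


def nfc_hex(text: str) -> str:
    """Normalize to NFC and return the code points as space-separated hex."""
    return " ".join(f"{ord(c):04X}" for c in unicodedata.normalize("NFC", text))


# Both marks have canonical combining class 230.
print([unicodedata.combining(c) for c in "\u0305\u0300"])  # [230, 230]

# Nothing stands between the starter and U+0300, so they compose.
print(nfc_hex("\u0061\u0300"))        # 00E0

# U+0305 precedes U+0300 with an equal combining class, so U+0300 is
# blocked from U+0061 and the sequence stays decomposed.
print(nfc_hex("\u0061\u0305\u0300"))  # 0061 0305 0300
```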

Re: Question about normalization tests

2012-12-10 Thread Mark Davis ☕
0300 *is* blocked, because there is a preceding character (0305) that has the same combining class (230). Mark * * *— Il meglio è l’inimico del bene —* ** On Mon, Dec 10, 2012 at 11:55 AM, Edwin Hoogerbeets wrote: > Looking at 0300, it is also no

Re: Question on U+33D7

2012-02-24 Thread Matt Ma
On Fri, Feb 24, 2012 at 5:18 AM, Shriramana Sharma wrote: > Grandpa grandpa I wanna hear the story about the turtles *now*! :-) > > Sent from my Android phone Thanks all for the enlightening reply. My intent was sorting using UCA but it really did not matter much because U+33D7 was sorted after

Re: Question on U+33D7

2012-02-24 Thread Shriramana Sharma
Grandpa grandpa I wanna hear the story about the turtles *now*! :-) Sent from my Android phone

Re: Question on U+33D7

2012-02-23 Thread Ken Whistler
On 2/23/2012 2:44 PM, António Martins-Tuválkin wrote: It is defined as > "33D7;SQUARE PH;So;0;L; 0050 0048N;SQUARED PH" > in UnicodeData.txt, but it is shown as "pH" in code chart. Should it be > "0070 0048" or "PH"? It should certainly be "pH", i.e., "0070 0048", because that's the

Re: Question on U+33D7

2012-02-23 Thread Asmus Freytag
On 2/23/2012 2:44 PM, António Martins-Tuválkin wrote: On 2012/2/23 Matt Ma wrote: It is defined as "33D7;SQUARE PH;So;0;L; 0050 0048N;SQUARED PH" in UnicodeData.txt, but it is shown as "pH" in code chart. Should it be "0070 0048" or "PH"? It should certainly be "pH", i.e., "0070 0048

Re: Question on U+33D7

2012-02-23 Thread António Martins-Tuválkin
On 2012/2/23 Matt Ma wrote: > It is defined as > "33D7;SQUARE PH;So;0;L; 0050 0048N;SQUARED PH" > in UnicodeData.txt, but it is shown as "pH" in code chart. Should it be > "0070 0048" or "PH"? It should certainly be "pH", i.e., "0070 0048", because that's the peculiar casing in widesprea

Re: Question on UCA collation parameters (strength = tertiary, alternate = shifted)

2011-12-01 Thread Matt Ma
In addition, the default setting in Table 14, UTS #10, 6.0.0 are strength: tertiary alternative: shifted But the setting won't generate the conformant behavior specified by CollationTest_SHIFTED.txt I think when alternative is set to shifted, strength should be set to quaternary (as default)

Re: Question on UCA collation parameters (strength = tertiary, alternate = shifted)

2011-11-29 Thread Matt Ma
Thanks for the clarification. But to pass the UCA conformance test on Shifted, does the strength have to be set to quaternary? However, it is stated in UCA, C2, "A conformant implementation shall support at least three levels of collation". Does this mean a UCA conformant implementation only need pass UCA

Re: Question on UCA collation parameters (strength = tertiary, alternate = shifted)

2011-11-29 Thread Mark Davis ☕
Yes, if the strength is tertiary, then Blanked and Shifted give the same results. http://www.unicode.org/reports/tr10/proposed.html#Variable_Weighting Mark *— Il meglio è l’inimico del bene —* * * * [https://plus.google.com/114199149796022210033] * On Tue, Nov 29, 2011 at 19:11, Matt Ma wrote

Re: Question on UCA collation parameters (strength = tertiary, alternate = shifted)

2011-11-29 Thread Ken Whistler
On 11/29/2011 11:11 AM, Matt Ma wrote: Does Shifted implies strength being quaternary? If strength stays as tertiary (default or explicitly set), it seems the collation behavior is Blanked. Please clarify. No. Shifted is a particular strategy for handling the "variable collation elements" (sta

Re: Question on Canonical equivilance

2004-11-25 Thread Antoine Leca
On Wednesday, November 24th, 2004 16:26Z Tim Greenwood va escriure: > All of the spacing combining marks (general category Mc) except > musical symbols have a canonical combining class of 0. > Why is this? About the Indic vowel signs, I assume it is this way to avoid them being reordered (in weir

RE: Question on Canonical equivilance

2004-11-24 Thread Kenneth Whistler
Tim Greenwood asked: > > All of the spacing combining marks (general category Mc) except > > musical symbols have a canonical combining class of 0. So, for example > > > > 0B95 (TAMIL LETTER KA) 0BC7 (TAMIL VOWEL SIGN EE - stands to the left > > of the consonant) 0BBE (TAMIL VOWEL SIGN AA - on th

RE: Question on Canonical equivilance

2004-11-24 Thread Peter Constable
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf > Of Tim Greenwood > All of the spacing combining marks (general category Mc) except > musical symbols have a canonical combining class of 0. So, for example > > 0B95 (TAMIL LETTER KA) 0BC7 (TAMIL VOWEL SIGN EE - stands to the left >

Re: Question on CLDR number patterns

2004-05-25 Thread Eric Muller
Mark Davis wrote: The decimal format looks like the following: #,##0.###;#,##0.###- I was actually looking at the locales through the ICU explorer, which apparently replaces the localizable characters by those specified in the , hence my confusion. (We s

Re: Question on CLDR number patterns

2004-05-25 Thread Mark Davis
Eric, The decimal format looks like the following: #,##0.###;#,##0.###- http://oss.software.ibm.com/cvs/icu/~checkout~/locale/diff/main/ar_KW.html#71 The digit zero is the Arabic-Indic symbol: http://oss.software.ibm.com/cvs/icu/~checkout~/locale/diff/main/ar_KW.html#162 This has lo

Re: Question on Unicode-prevalence (general and for Cyrillic)

2004-03-15 Thread Antoine Leca
Peter Kirk va escriure: >> 2. A graduate student mentioned that it was her impression that most >> Cyrillic webpages (at least for Russian--her interest) are still not >> encoded in Unicode. (She is doing some research on the use of >> certain words in Russian and wanted to know how best to do the

Re: Question on Unicode-prevalence (general and for Cyrillic)

2004-03-14 Thread Peter Kirk
On 14/03/2004 12:25, Deborah W. Anderson wrote: ... 2. A graduate student mentioned that it was her impression that most Cyrillic webpages (at least for Russian--her interest) are still not encoded in Unicode. (She is doing some research on the use of certain words in Russian and wanted to know h

Re: Question about properties of some Code Points

2003-07-22 Thread Asmus Freytag
At 04:50 AM 7/22/03 +0200, Chris Jacobs wrote: > Where am I going with this? Basically what I'm after is a clean/clear > way to tell if quotation marks and parentheses (plus the other > bracketing characters such as '[' or '{' are opening or closing > punctuation. That's the real question here!

Re: Question about properties of some Code Points

2003-07-21 Thread Chris Jacobs
> Where am I going with this? Basically what I'm after is a clean/clear > way to tell if quotation marks and parentheses (plus the other > bracketing characters such as '[' or '{' are opening or closing > punctuation. That's the real question here! How would you do that > using properties and ca

RE: Question about Unicode Ranges in TrueType fonts

2003-06-27 Thread Peter_Constable
> but premature standardization can > also be a problem if the wrong choices get codified too soon. As in canonical combining classes? :-) - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 8:16 PM, Elisha Berns <[EMAIL PROTECTED]> wrote: > It would appear from your answer that even after implementing the > algorithm to search the Unicode block coverage of a font, the actual > comparison "data", that is which blocks to compare and how many code > points, is

RE: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Kenneth Whistler
Elisha Berns asked: > It would appear from your answer that even after implementing the > algorithm to search the Unicode block coverage of a font, the actual > comparison "data", that is which blocks to compare and how many code > points, is totally undefined. Is there any kind of standard for >

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread John Cowan
Elisha Berns scripsit: > It's odd to think that the old way of using Charset identifiers in fonts > worked a lot more cleanly for finding fonts matching a language/language > group. I would think this kind of core issue would be addressed more > cleanly by the font standard. Actually it worked b

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 4:13 PM, Andrew C. West <[EMAIL PROTECTED]> wrote: > On Thu, 26 Jun 2003 14:26:13 +0200, "Philippe Verdy" wrote: > > > Isn't there a work-around with the following function (quote from > > Microsoft MSDN): > > (with the caveat that you first need to allocate and fill a

RE: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Elisha Berns
Andrew West wrote: > By looping through the "ranges" array it is possible to determine exactly > which > characters in which Unicode blocks a given font covers (as long as your > software > has an array of Unicode blocks and their codepoint ranges). > As long as your software has an up-to-date li

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Andrew C. West
On Thu, 26 Jun 2003 14:26:13 +0200, "Philippe Verdy" wrote: > Isn't there a work-around with the following function (quote from Microsoft > MSDN): > (with the caveat that you first need to allocate and fill a Unicode string for > the > codepoints you want to test, and this can be lengthy if one wa

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 2:26 PM, Philippe Verdy <[EMAIL PROTECTED]> wrote: I also forgot the probably better function from the Uniscribe library, which processes strings through a language-dependent shaping algorithm, and can determine appropriate glyph substitution, or use custom composite f

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 11:50 AM, Andrew C. West <[EMAIL PROTECTED]> wrote: > On Wed, 25 Jun 2003 21:58:28 -0700, "Elisha Berns" wrote: > > > Some weeks back there were a number of postings about software for > > viewing Unicode Ranges in TrueType fonts and I had a few questions > > about that

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Andrew C. West
On Wed, 25 Jun 2003 21:58:28 -0700, "Elisha Berns" wrote: > Some weeks back there were a number of postings about software for > viewing Unicode Ranges in TrueType fonts and I had a few questions about > that. Most viewers listed seemed to only check the Unicode Range bits of > the fonts which can

Re: Question about CollationTest_NON_IGNORABLE.txt & NormalizationTest.txt

2003-03-10 Thread Markus Scherer
No takers for this question? Let me try... askq1 askq1 wrote: The CollationTest_NON_IGNORABLE.txt & NormalizationTest.txt contain test-cases for sorting and normalization. The strings that are mentioned in these files follow a specific order: ... I want to know if these files are organized consi

RE: question about Windows-1252 and Unicode mapping

2003-02-27 Thread Murray Sargent
As KenW pointed out, I meant May 1998, not 1988! Thanks Murray -Original Message- From: Murray Sargent Sent: Thursday, February 27, 2003 3:44 PM To: 'Yung-Fong Tang' Cc: John Myers; Takayuki Tei; kat momoi; Naoki Hotta; Cathy Wissink; [EMAIL PROTECTED] Subject: RE: ques

RE: question about Windows-1252 and Unicode mapping

2003-02-27 Thread Cathy Wissink
Actually, that would be 19*9*8. The Euro was added to all Windows code pages except 932. In addition, the Z caron was added to 1252 at that time. The version of this code page first shipped on the NT platform with Windows 2000, and did ship on the Win9x platform with Windows ME. (I'm not sure ab
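
The two additions mentioned here can be checked against a present-day mapping table with a minimal Python sketch. Python and its built-in "cp1252" codec are assumptions of this note, not something referenced in the thread, and the helper name `cp1252_to_unicode` is hypothetical. The sketch only shows what the current table contains; it says nothing about when each entry was added.

```python
def cp1252_to_unicode(byte_value: int) -> str:
    """Decode a single Windows-1252 byte to its Unicode character."""
    return bytes([byte_value]).decode("cp1252")


# 0x80 is the Euro sign; 0x8E and 0x9E are Z with caron (upper and lower case).
for byte_value in (0x80, 0x8E, 0x9E):
    ch = cp1252_to_unicode(byte_value)
    print(f"0x{byte_value:02X} -> U+{ord(ch):04X}")
```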

RE: question about Windows-1252 and Unicode mapping

2003-02-27 Thread Murray Sargent
I think the Euro at 0x80 for 1252 (and several other 125x code pages) was added in May 1988. Cathy Wissink can confirm this. It certainly happened before 1999, since we added support for it in RichEdit 3.0 which shipped with Windows 2000 and Office 2000. Murray -Original Message- From: Yu

RE: Question: the german umlaut

2002-11-11 Thread Dominikus Scherkl
> I just wanted to know how much space in bytes the Latin-1 > characters such as the german umlaut characters take up in > UTF-8 encoding. Is it still just one byte or does it now > require 2 bytes? U+0000 up to U+007F take 1 byte (ASCII) U+0080 up to U+07FF take 2 bytes (Latin-1, Latin extended
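
The ranges in this answer can be sketched in a few lines of Python and compared with a real encoder. Python is an assumption of this note, and the function name `utf8_length` is hypothetical. The 3-byte and 4-byte boundaries come from the standard UTF-8 definition, since the snippet above is cut off before it reaches them.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for one Unicode scalar value."""
    if code_point < 0x80:      # U+0000..U+007F (ASCII)
        return 1
    if code_point < 0x800:     # U+0080..U+07FF (includes the Latin-1 umlauts)
        return 2
    if code_point < 0x10000:   # U+0800..U+FFFF
        return 3
    return 4                   # U+10000..U+10FFFF


# Cross-check the table against Python's own UTF-8 encoder.
for ch in "Aäöüß€":
    assert utf8_length(ord(ch)) == len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X}: {utf8_length(ord(ch))} byte(s)")
```

So the German umlauts, which sit in the U+0080..U+07FF range, take 2 bytes each in UTF-8.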

Re: Question: the german umlaut

2002-11-08 Thread James E. Agenbroad
On Fri, 8 Nov 2002, Magda Danish (Unicode) wrote: > > > > -Original Message- > > > > Date/Time:Fri Nov 8 09:05:40 EST 2002 > > Contact: [EMAIL PROTECTED] > > Report Type: Other Question, Problem, or Feedback > > > > Hello > > > > I just wanted to know how much space in byte

Re: Question: the german umlaut

2002-11-08 Thread Stefan Persson
- Original Message - From: "Magda Danish (Unicode)" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Friday, November 08, 2002 9:17 PM Subject: Question: the german umlaut > > I just wanted to know how much space in bytes the Latin-1 > > characters such as the german umlaut characters

Re: Question

2002-01-15 Thread Asmus Freytag
At 06:26 PM 1/15/02 -0800, Kenneth Whistler wrote: > > Hello. I am looking for help with Unicode. I was recently told by my > credit > > card processing company that I need to Upgrade my site to unicode 3.2 in > > order to get a perl script working. > >There has got to be a disconnect here somew

Re: Question

2002-01-15 Thread John Hudson
At 18:26 1/15/2002, Kenneth Whistler wrote: >And you don't *install* Unicode. Wouldn't it be nice if it was that easy! John Hudson Tiro Typeworks www.tiro.com Vancouver, BC [EMAIL PROTECTED] ... es ist ein unwiederbringliches Bild der Vergangenheit, das mit jeder Gegenwart

Re: Question

2002-01-15 Thread Kenneth Whistler
> Hello. I am looking for help with Unicode. I was recently told by my credit > card processing company that I need to Upgrade my site to unicode 3.2 in > order to get a perl script working. There has got to be a disconnect here somewhere. Unicode 3.2 hasn't been released yet, and won't be for

Re: Question

2002-01-15 Thread Barry Caplan
Can you describe the nature of the script and how it uses Unicode (if at all) or what it uses for text processing. What version of Unicode are you using now for your data? Best regards, Barry Caplan At 05:15 PM 1/15/2002 -0800, BBCOA Webmaster wrote: > Hello. I am looking for help with Un

Re: Question about some MS IE options

2001-12-04 Thread Otto Stolz
Robert M. Gerlach wrote: > When saving a webpage from within Microsoft Internet Explorer, Which Version? I've tested your issue with version 5 SP2 (more precisely: 5.00.3314.2101). > there are a few notable options... ...for the encoding of the file. > and I'm really unsure as to what the di

Re: Question about some compatibility characters

2001-10-29 Thread Peter_Constable
>In the IETF Internationalised Domain Names working group, there is some >discussion about which normalisation mapping to apply to names entered >by users... >I.e. can anyone who has an appropriate keyboard try to determine whether >these characters can be entered in a Unicode-enabled entry fie

RE: Question about some compatibility characters

2001-10-29 Thread Marco Cimarosti
David Hopwood wrote: > [...] > The following characters have compatibility mappings but not canonical > mappings (and also satisfy some other criteria that aren't really > important to my question): > [...] > The question I'd like to ask is whether they are produced in practice > by common keyboar

Re: Question regarding Mac OS X Unicode support

2001-07-16 Thread Yung-Fong Tang
"John H. Jenkins" wrote: > At 9:36 AM -0400 7/16/01, Patrick Rourke wrote: > >This is probably a FAQ, but I couldn't find it either in the Unicode > >archives on egroups or on the Apple website [which doesn't mean it's not > >there] . . . is there a distinction between the Unicode support in Ca

Re: Question regarding Mac OS X Unicode support

2001-07-16 Thread John H. Jenkins
At 9:36 AM -0400 7/16/01, Patrick Rourke wrote: >This is probably a FAQ, but I couldn't find it either in the Unicode >archives on egroups or on the Apple website [which doesn't mean it's not >there] . . . is there a distinction between the Unicode support in Carbon >and Cocoa? For the ranges I'm

Re: Question about UTR#24

2001-05-29 Thread Kenneth Whistler
Marco asked: > I have a question about the file > , the data file for > UTR#24 (Script Names). > > I see that script-specific combining characters are normally assigned to > that script. However, a few of them are in the INHERITED class: > > A

Re: Question on Unicode data files

2001-02-28 Thread Jianping Yang
Mr Zhang is CEO of that company. Regards, Jianping. John Jenkins wrote: > On Monday, February 26, 2001, at 09:12 PM, Richard Cook wrote: > > > Is there any connection between this http://www.unihan.com.cn/ site and > > IRG? What is UniHan Digital Tech Co.? Their website has some rather > > anno

Re: Question on Unicode data files

2001-02-27 Thread John Jenkins
On Monday, February 26, 2001, at 09:12 PM, Richard Cook wrote: > Is there any connection between this http://www.unihan.com.cn/ site and > IRG? What is UniHan Digital Tech Co.? Their website has some rather > annoying graphics and windows, but no basic info that i can see ... the > bottom button

Re: Question on Unicode data files

2001-02-26 Thread Richard Cook
"John H. Jenkins" wrote: > > At 7:57 AM -0800 2/26/01, Richard Zhang wrote: > >Hello, Marco, > > > >Unihan is the official site I think. You can visit www.unihan.com.cn for > >more information about this, if you know Chinese :). Knowing Chinese is not enough. You and your browser need to know Si

Re: Question on Unicode data files

2001-02-26 Thread John H. Jenkins
At 7:57 AM -0800 2/26/01, Richard Zhang wrote: >Hello, Marco, > >Unihan is the official site I think. You can visit www.unihan.com.cn for >more information about this, if you know Chinese :). > >If you sign up for cooperation with them, you will get full access to their >database. > No, Unihan is

Re: Question on Unicode data files

2001-02-26 Thread Kenneth Whistler
Marco asked: > > The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped > on http://www.unicode.org/Public) contains several files with mappings of > East Asian character sets to/from Unicode. > > Are all these sources in sync? If not, which ones is it better to trust? >

Re: Question on Unicode data files

2001-02-26 Thread Richard Zhang
Hello, Marco, Unihan is the official site I think. You can visit www.unihan.com.cn for more information about this, if you know Chinese :). If you sign up for cooperation with them, you will get full access to their database. Is this helpful to you? Best regards, Richard - Original Mess

RE: Question on Unicode data files

2001-02-26 Thread Marco Cimarosti
I wrote > - UNIDATA/CJKXREF.TXT ([...] Errata: I meant UNIDATA/UNIHAN.TXT. Sorry. _ Marco

RE: Question

2000-08-02 Thread addison
> > -Original Message- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] > > Sent: Tuesday, August 01, 2000 6:49 PM > > To: Unicode List > > Cc: Unicode List > > Subject: RE: Question > > > > > > Hi Vinit, > > > > Actually,

Re: Question

2000-08-02 Thread Mark Davis
e. > > If i have any questions further, i will ask you. > Thanks! > > Thanks and regards, > Vinit Bhatt > > > -Original Message- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] > > Sent: Tuesday, August 01, 2000 6:49 PM > > To: Unicode List

RE: Question

2000-08-02 Thread Vinit Bhatt
[EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] > Sent: Tuesday, August 01, 2000 6:49 PM > To: Unicode List > Cc: Unicode List > Subject: RE: Question > > > Hi Vinit, > > Actually, the Locale class is built into the Java language. > > Perhaps my previous message was un

RE: Question

2000-08-01 Thread addison
tutorial. Can you please guide me through a specific URL on > javasoft where > i can find the example classes templates on Unicode ? That will really help > in > coding my efforts. thanks a lot. > > Thanks and regards, > Vinit Bhatt > 703-344-6942 > > > >

RE: Question

2000-08-01 Thread Vinit Bhatt
0 1:16 PM > To: Unicode List > Cc: Unicode List > Subject: Re: Question > > > Well > > A list of languages supported by Unicode is fairly long (and a complex > topic). > > The Java programming language has varying levels of support for a variety > of languag

Re: Question regarding bidirectional algorithm

2000-08-01 Thread Timothy Partridge
Markus Scherer recently said: > > David Tooke wrote: > > The bidirectional algorithm mentions mirrored glyphs. The reference code handles >them by replacing these characters with their mirror image. Is this the preferred >method of doing this? If so, is there any where in the Unicode datab

Re: Question

2000-08-01 Thread addison
Well A list of languages supported by Unicode is fairly long (and a complex topic). The Java programming language has varying levels of support for a variety of languages. This support is evolving, even as I write. For example: There is no (built-in) support for calendars other than the Gr
