Re: Combining Marks and Variation Selectors

2020-02-02 Thread Eric Muller via Unicode
That would imply some coordination among variations sequences on different code points, right? E.g. <0B48> ≡ <0B47, 0B56>, so a variation sequence on 0B56 (Mn, ccc=0) would imply the existence of a variation sequence on 0B48 with the same variation selector
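The equivalence cited in this message can be checked with Python's standard `unicodedata` module. This is only a minimal sketch; the variable names are illustrative and not from the thread.

```python
import unicodedata

# U+0B48 ORIYA VOWEL SIGN AI and its canonical decomposition.
composed = "\u0B48"
decomposed = unicodedata.normalize("NFD", composed)

# Code points of the decomposition, as uppercase hex strings.
decomposed_hex = [f"{ord(c):04X}" for c in decomposed]

# General category and canonical combining class of U+0B56.
length_mark_category = unicodedata.category("\u0B56")
length_mark_ccc = unicodedata.combining("\u0B56")
```

Because U+0B56 is the second element of the decomposition and has ccc=0, a variation selector placed after it would also sit directly after the decomposed form of U+0B48, which is the coordination issue raised above.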

Re: Keyboard layouts and CLDR

2018-01-30 Thread Eric Muller via Unicode
Indeed. But "Faÿ-lès-Nemours" / "FAŸ-LÈS-NEMOURS". "lès" in French place names means "near", typically followed by another city name or a river name. In the case of "L'Haÿ-les-Roses", it's just that they have a famous rose garden, so "les". Eric. On 1/30/2018 12:06 AM, Martin J. Dürst via

0027, 02BC, 2019, or a new character?

2018-01-15 Thread Eric Muller via Unicode
https://www.nytimes.com/2018/01/15/world/asia/kazakhstan-alphabet-nursultan-nazarbayev.html Eric.

Re: Database missing/erroneous information

2017-07-12 Thread Eric Muller via Unicode
In the .grouped.xml file, if a <char> does not have an attribute, it inherits it from its containing element. The group containing the digits has IDC="Y" OIDC="N" XIDC="Y", and so that applies to the digits as well. If you don't want to deal with the inhe

Bengla syllables <... 09BF 09BE> and <... 09BF 09C0>

2017-02-07 Thread Eric Muller
In looking at the wiki{pedia,book.source,tionary} corpus for Bengla, I see a relatively large number of syllables with <... 09BF 09BE> or <... 09BF 09C0>. I checked a couple of sources, and I did not find them listed anywhere as being normally used. Are they in normal use or are those all typ

how would you state requirements involving sorting?

2017-01-23 Thread Eric Muller
Suppose you help somebody write requirements for a piece of software and you see an item: Sorting. Diacritic marks need to be stripped when sorting titles You know that sorting is a lot more complicated than removing diacritics, and that giving the di

Re: "textels"

2016-09-16 Thread Eric Muller
On 9/16/2016 8:30 AM, Janusz S. Bien wrote: Quote/Cytat - Eric Muller (pią, 16 wrz 2016, 17:03:54): On 9/16/2016 6:52 AM, Janusz S. Bień wrote: (when working on a corpus of historical Polish we noticed some cases where standard Unicode equivalence was not convenient). I'm very inter

Re: "textels"

2016-09-16 Thread Eric Muller
On 9/16/2016 6:52 AM, Janusz S. Bień wrote: (when working on a corpus of historical Polish we noticed some cases where standard Unicode equivalence was not convenient). I'm very interested to know more about those cases. Thanks, Eric.

Emoji Feminism - The New York Times

2016-03-13 Thread Eric Muller
http://www.nytimes.com/2016/03/13/opinion/sunday/emoji-feminism.html?_r=0

The Chinese Typewriter: The Design and Science of East Asian Information Technology

2016-01-16 Thread Eric Muller
For those who are in the San Francisco Bay Area: https://library.stanford.edu/eal The Chinese Typewriter: The Design and Science of East Asian Information Technology During the 19th and 20th centuries,

Re: Proposal for German capital letter "ß"

2015-12-10 Thread Eric Muller
On 12/10/2015 2:45 AM, Frédéric Grosshans wrote: Le 10/12/2015 05:32, Martin J. Dürst a écrit : A similar example is the use of accents on upper-case letters in French in France where 'officially', upper-case letters are written without accents. Actually, the official body in charge of this (Ac

Toki Pona: A Language With a Hundred Words - The Atlantic

2015-07-28 Thread Eric Muller
http://www.theatlantic.com/technology/archive/2015/07/toki-pona-smallest-language/398363/ Eric.

Re: UDHR in Unicode: 400 translations in text form!

2015-06-29 Thread Eric Muller
On 6/28/2015 12:30 PM, Ken Shirriff wrote: I don't mean to be critical, but I find the UDHR page is really hard to use. Thanks for the observations. I'll try to find a better organization. Eric.

Re: UDHR in Unicode: 400 translations in text form!

2015-06-29 Thread Eric Muller
On 6/28/2015 12:20 PM, Philippe Verdy wrote: Note: The marker icons showing languages in the Leaflet component (over the OSM map) are not working (broken links) Fixed, I believe. Also the locations assigned of some international languages is strange: Esperanto ... Picard ... Standard French

Re: UDHR in Unicode: 400 translations in text form!

2015-06-29 Thread Eric Muller
On 6/28/2015 10:24 PM, Leo Broukhis wrote: Ukrainian is in Estonia, Estonian is in the Baltic sea. I took the locations from glottolog.org. The first error is mine, I mistyped a value. The second error comes from Glottolog, I corrected and reported to them. Will appear in the next update.

UDHR in Unicode: 400 translations in text form!

2015-06-28 Thread Eric Muller
I am pleased to announce that the UDHR in Unicode project (http://unicode.org/udhr) has reached a notable milestone: we now have 400 translations of the Universal Declaration of Human Rights in text form. The latest translation is in Sinhala, thanks to Keshan Sodimana, Pasundu de Silva and Sas

Re: WORD JOINER vs ZWNBSP

2015-06-26 Thread Eric Muller
On 6/26/2015 3:48 AM, Marcel Schneider wrote: To do traditional French typography on the PC, or anywhere a justifying no-break space is needed along with the colon, because this punctuation must be placed in the middle between the

Help with African characters, please

2015-06-21 Thread Eric Muller
Can you help me identify the characters used in the Kulango, Bouna translation of the UDHR? The text is at . Look for article 14. What is the second letter of the word for "article" (after the N, looks like a greek nu), and w

Re: Another take on the English apostrophe in Unicode

2015-06-12 Thread Eric Muller
On 6/10/2015 9:37 PM, Philippe Verdy wrote: The French "pomme de terre" ("potato" in English, French vulgar synonym : "patate") is a single lemma in dictionaries, but is still 3 separate words (only the first one takes the plural mark), it is

Re: Another take on the English apostrophe in Unicode

2015-06-05 Thread Eric Muller
On 6/5/2015 10:29 AM, John D. Burger wrote: Linguistically, "don't" and friends pass all the diagnostics that indicate they're single words. If I am not mistaken, the French "pomme de terre" also passes the diagnostics. So we need a new space character. Eric.

Re: ucd beta, stable filenames

2015-06-05 Thread Eric Muller
On 6/5/2015 8:48 AM, Daniel Bünzli wrote: Hello, Would it be possible in the future to publish the latest version of the ucd files without the -X.Y.ZdW suffixes under a fixed URI like http://www.unicode.org/Public/beta/ and/or simply publish it in the version directory but without the suff

Re: Tag characters

2015-05-26 Thread Eric Muller
On 5/21/2015 1:25 PM, Asmus Freytag (t) wrote: On 5/21/2015 8:46 AM, Peter Constable wrote: Would Unicode really want to get into the business of running a UFL se

Re: Tag characters

2015-05-20 Thread Eric Muller
On 5/20/2015 7:11 PM, Doug Ewell wrote: In any event, URLs that point to images would be an awful basis for an encoding. I would make an exception for the URL http://unicode.org/Public/8.0.0/ucd/StandardizedFlags.html. Eric.

Re: Usage stats?

2015-03-27 Thread Eric Muller
Would a corpus like Wikipedia or Project Gutenberg be appropriate for your purpose? Both are freely and easily accessible. and . Eric.

Séminaire doctoral "Chemins des écritures" | Gripic

2014-12-17 Thread Eric Muller
This seminar may be of interest to those in France. http://www.gripic.fr/evenement/seminaire-doctoral-chemins-ecritures Eric. ___ Unicode mailing list Unicode@unicode.org http://unicode.org/mailman/listinfo/unicode

Re: fonts for U7.0 scripts

2014-10-23 Thread Eric Muller
How about even having just the glyphs in the Unicode.org charts being in the public domain? Very easy to achieve: 1. Ask the owner of the font how much money he wants to part with his property. 2. Write a check for the corresponding amount. 3. You are now the owner, you can put the font in

Re: Help with Hebrew

2014-07-26 Thread Eric Muller
Many thanks for all the answers on my Hebrew and Arabic questions. On 7/6/2014 4:18 AM, Matitiahu Allouche wrote: The original text is interesting, combining French, Latin and Hebrew. There is also a fair amount of Greek, and a couple of Arabic words. Unfortunately, the author and/or the typ

Parsers for the UnicodeSet notation?

2014-07-23 Thread Eric Muller
I would like to work with the exemplarCharacters data in the CLDR. That uses the UnicodeSet notation. Is there somewhere a parser for that notation, that would return me just the list of characters in the set? Something a bit like the UnicodeSet utility at

Help with arabic

2014-07-05 Thread Eric Muller
I am working on the digitization of a text that includes Arabic; could somebody please tell me what is the Unicode representation of the (short) fragments on those two pages? http://gallica.bnf.fr/ark:/12148/bpt6k6439352j/f33.image http://gallica.bnf.fr/ark:/12148/bpt6k6439352j/f474.image Tha

Time to learn French!

2014-05-08 Thread Eric Muller
http://www.forbes.com/sites/pascalemmanuelgobry/2014/03/21/want-to-know-the-language-of-the-future-the-data-suggests-it-could-be-french/ http://www.france24.com/en/20140326-will-french-be-world-most-spoken-language-2050/ http://www.boston.com/bostonglobe/ideas/brainiac/2014/03/the_language_of_1.

Re: Editing Sinhala and Similar Scripts

2014-03-19 Thread Eric Muller
On 3/19/2014 7:57 AM, Peter Constable wrote: It is nonsensical to talk about erasing a _keystroke_. "undo", "revert" the effect of a keystroke. The concept is meaningful. Eric.

Transforming BidiTest.txt to the format of BidiCharacterTest.txt

2014-02-12 Thread Eric Muller
Does anybody have a program that transforms the UCD file BidiTest.txt to the format of BidiCharacterTest.txt, and that they are willing to share? Thanks, Eric.

Representation of neutral tone in pinyin and bopomofo

2013-11-13 Thread Eric Muller
Is it correct that: - in bopomofo, the neutral (or light) tone is represented by U+02D9 ˙ DOT ABOVE, and in the text representation, that character follows the bopomofo characters of the syllable (just like all the other characters for tones) - in pinyin, the neutral tone is typically not ma

Re: Can the combining diacritical marks combine with any base character?

2013-02-12 Thread Eric Muller
On 2/11/2013 12:49 AM, Richard Wordingham wrote: The problem sequence is which is canonically equivalent to . Which demonstrates: NFC applied to the serialization of an XML infoset is not the same as NFC applied to the text nodes and attributes of that infoset. The short answer is that X
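The difference between the two ways of applying NFC can be seen in a small Python sketch. The sequences in the snippet above were lost in archiving, so the example below assumes, purely for illustration, a text node that begins with U+0338 COMBINING LONG SOLIDUS OVERLAY; the element name `a` is likewise made up.

```python
import unicodedata

# Assumed example: a text node that starts with a combining mark (U+0338).
text_node = "\u0338abc"
serialized = "<a>" + text_node + "</a>"

# NFC applied to the text node only: the markup is left untouched.
per_node = "<a>" + unicodedata.normalize("NFC", text_node) + "</a>"

# NFC applied to the whole serialization: the ">" that closes the start tag
# composes with U+0338 into U+226F NOT GREATER-THAN.
whole = unicodedata.normalize("NFC", serialized)
```

In this sketch `whole` no longer contains a well-formed start tag, while `per_node` still does.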

Re: Case-folding dotted i

2013-01-29 Thread Eric Muller
On 1/24/2013 2:15 AM, Richard Wordingham wrote: If text is going to be processed, i+dot is wrong for Turkish, as the Unicode casing rules for Turkish will capitalise it to I+dot+dot, which should display with two dots. If you're going to use an explicit dot, I'd have said would be better, though
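The "I+dot+dot" effect described in the quoted text can be illustrated in Python. Python's `str.upper()` is not locale-sensitive, so `turkish_upper` below is a hypothetical helper that approximates the Turkish tailoring for the two i letters only; it is a sketch, not a complete implementation of Turkish casing.

```python
def turkish_upper(text: str) -> str:
    """Uppercase with the Turkish dotted/dotless i pairs (hypothetical helper)."""
    # Handle the two Turkish-specific pairs first (i -> U+0130, U+0131 -> I),
    # then fall back to the default mapping for everything else.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()

# "i" followed by an explicit U+0307 COMBINING DOT ABOVE.
explicit_dot = "i\u0307"

default_result = explicit_dot.upper()          # default (non-Turkish) mapping
turkish_result = turkish_upper(explicit_dot)   # Turkish mapping
```

Under the Turkish mapping the result is U+0130 (already dotted) followed by U+0307, that is, a capital I carrying two dots.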

Re: Too narrowly defined: DIVISION SIGN & COLON

2012-07-11 Thread Eric Muller
On 7/11/2012 9:20 AM, Julian Bradfield wrote: Unicode is about plain text. TeX is about fine typesetting. Too narrowly defined: Unicode. I think Unicode is not just for plain text, but rather concerns itself with only the lower layer of /any/ text system. When it's plain text, Unicode has t

Record-A-Thon is tomorrow – help record 50 languages in a single day

2011-07-29 Thread Eric Muller
From http://blog.mightyverse.com/2011/06/300-languages-record-a-thon/ On July 30th, 2011 we will meet at the Internet Archive in San Francisco, where volunteers will record the Universal Declaration of Human Rights (UDHR) in their n

Re: Derived age regexp

2010-10-15 Thread Eric Muller
On 10/15/2010 3:19 PM, Tim Greenwood wrote: Is there any regular expression - in perl, or elsewhere, that enables searching on the derived age? I want to find all characters in a file added since Unicode 4.1. I could write it all by processing against the derived age file, but it would be nice
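The "processing against the derived age file" approach mentioned in the question could look roughly like the Python sketch below. `SAMPLE` is a handful of illustrative lines in the DerivedAge.txt layout, not the real file, and the function names are made up for this example.

```python
import re

# Illustrative lines in the DerivedAge.txt layout; not the full file.
SAMPLE = """\
# comment lines and blank lines are ignored
0041..005A    ; 1.1 #  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
20AC          ; 2.1 #       EURO SIGN
1E9E          ; 5.1 #       LATIN CAPITAL LETTER SHARP S
20B9          ; 6.0 #       INDIAN RUPEE SIGN
"""

# "XXXX" or "XXXX..YYYY", then ";", then "major.minor".
LINE = re.compile(r"^([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\d+)\.(\d+)")

def chars_newer_than(derived_age_text, version):
    """Return the set of characters whose age is strictly greater than `version`."""
    newer = set()
    for line in derived_age_text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # comment or blank line
        first = int(m.group(1), 16)
        last = int(m.group(2) or m.group(1), 16)
        age = (int(m.group(3)), int(m.group(4)))
        if age > version:
            newer.update(chr(cp) for cp in range(first, last + 1))
    return newer

def find_newer(text, derived_age_text, version=(4, 1)):
    """List the characters of `text` added after `version`, in order of appearance."""
    newer = chars_newer_than(derived_age_text, version)
    return [c for c in text if c in newer]
```

With the real DerivedAge.txt read from disk in place of `SAMPLE`, `find_newer(file_contents, derived_age_text)` would report every character in a file that was added after Unicode 4.1.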

Re: OpenType update for Unicode 5.2/6.0?

2010-10-15 Thread Eric Muller
I entirely second Peter's description. Let’s keep this in perspective: consider just how much progress there has been in the last ten years. IMHO, we can all be grateful to Microsoft in that area. I don't believe any other company or group has been as instrumental in bringing real solution

Re: looks like some problem in Scripts.txt file of UCD

2010-08-13 Thread Eric Muller
looks good, but hmm its really hard to guess characters script when it will be alone. I think one need to add extra check, when character will be at initial position with property inherited Indeed, if the text is simply some base character of script Common (NBSP, dotted circle) + one of

The end of movable type in China: idsgn (a design blog)

2010-07-28 Thread Eric Muller
http://www.idsgn.org/posts/the-end-of-movable-type-in-china/

Re: Bengali Script

2010-07-12 Thread Eric Muller
On 7/8/2010 5:09 PM, Tulasi wrote: Ok I am correcting - "Bangladeshi" to "Bengali". The Government of West Bengal / Society for Natural Language Technology Research (a member of the Consortium) has a very strong preference for the term "Bengla" rather than "Bengali". Eric.

Re: Titlecasing iota subscript

2010-06-03 Thread Eric Muller
See also the FAQ, http://www.unicode.org/faq/greek.html#6 Eric.

NYT article: Using a New Language in Africa to Save Dying Ones

2004-11-13 Thread Eric Muller
http://www.nytimes.com/2004/11/12/international/africa/12africa.html?ex=1101365144&ei=1&en=b4b60fe9706acc9b Eric.

Re: [africa] Unicode & IDNs

2004-11-09 Thread Eric Muller
Works for me by clicking on the link in Chris's message. Mozilla 1.7 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616. Running on some configuration of XP SP2. The navigation bar shows "http://www.ÉÉ.net". Eric.

Re: Looking for a C library that converts UTF-8 strings from their decomposed to pre-composed form

2004-11-08 Thread Eric Muller
Deborah Goldsmith wrote: It's worth pointing out that there is no such thing as "precomposed Unicode". Normalization form C (NFC) could be called "as precomposed as possible." There are some sequences of Unicode that can only be expressed using combining marks. As well as single (precomposed) c
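The "as precomposed as possible" description can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch; the two sample strings are chosen for this example and do not come from the thread.

```python
import unicodedata

def nfc_len(s):
    """Number of code points in `s` after NFC normalization."""
    return len(unicodedata.normalize("NFC", s))

e_acute = "e\u0301"  # has a precomposed form, U+00E9
q_acute = "q\u0301"  # no precomposed form exists, so NFC keeps two code points
```

Here `nfc_len(e_acute)` is 1 and `nfc_len(q_acute)` is 2: NFC composes where a precomposed character exists and leaves the combining sequence alone where none does.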

Re: Looking for the UDHR in Thai

2004-11-03 Thread Eric Muller
Ed, thanks for the pointers. I'll be in touch with you off-list. In the meantime, I have Hindi, Sanskrit, Magahi and Bhojpuri versions at . Eric.

Looking for the UDHR in Thai

2004-11-02 Thread Eric Muller
I am using the various versions of the Universal Declaration of Human Rights at as test material. Unfortunately, the Thai version is an image, and the resolution is not good enough for me to even attempt to retype the document. Can somebody point me to eith

Re: outside decomposed, inside precomposed

2004-10-13 Thread Eric Muller
Kenneth Whistler wrote: However, implementations don't get to pick and choose so easily about aspects of the standard such as encoding forms and normalization. You can't, for example, recognize that is canonically equivalent to U+00F1 (ñ), but claim *not* to recognize that is likewise canonicall

Re: outside decomposed, inside precomposed

2004-10-13 Thread Eric Muller
Jon Hanna wrote: imported UTF-8 sequences like [U+0065][U+0303] get remapped internally to [U+1ebd] LATIN SMALL LETTER E WITH TILDE. Is this kind of behavior what one would expect? That's conformant, if it causes problems with any other process (including other processes
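The remapping described in the quoted question matches what NFC does, and it is reversible up to canonical equivalence. A short Python check, with variable names chosen only for this sketch:

```python
import unicodedata

imported = "\u0065\u0303"  # e + COMBINING TILDE, as in the quoted example

# What an NFC-normalizing process would store internally.
internal = unicodedata.normalize("NFC", imported)
internal_name = unicodedata.name(internal)

# Decomposing the internal form gives back the imported sequence.
round_trip = unicodedata.normalize("NFD", internal)
```

A process that compares strings after normalizing both sides to the same form would treat `imported` and `internal` as the same text.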

Re: CYRILIC CAPITAL/SMALL LETTER PE WITH DESCENDER

2004-10-11 Thread Eric Muller
I have found a couple of books on Abkhaz, and I have scanned some pages. They are at . Page 88 of the second book is a reproduction of a third book, which seems very interesting, but that I have not been able to locate. Eric.

Re: CYRILIC CAPITAL/SMALL LETTER PE WITH DESCENDER

2004-09-28 Thread Eric Muller
Michael Everson wrote: At 08:12 -0700 2004-09-28, Eric Muller wrote: It seems that Abkhaz, written in Cyrillic, uses a PE WITH DESCENDER, but I can't find this case pair in Unicode. I am missing something, or do we need to encode those? U+04A6, U+04A7 are used in Abkhaz for that sou

CYRILIC CAPITAL/SMALL LETTER PE WITH DESCENDER

2004-09-28 Thread Eric Muller
It seems that Abkhaz, written in Cyrillic, uses a PE WITH DESCENDER, but I can't find this case pair in Unicode. I am missing something, or do we need to encode those? Evidence: - Daniels and Bright, p717, table 60.15, right column, 6th entry. - Universal Declaration of Human Rights, at

Re: Saudi-Arabian Copyright sign

2004-09-20 Thread Eric Muller
D. Starner wrote: It's a simple combining character. Even if you can't do arbitrary circles around characters, you can take one character sequence and map it to the glyph in a font. Systems that can't do even that need to be fixed. This sounds nice, but in practice, things are a bit more compli

Re: valid characters in user names- esp. compatibility characters

2004-08-14 Thread Eric Muller
Tex Texin wrote: However, I am curious as to whether some Users might read/write their names using compatibility characters (esp. in ideographic markets) and object to the characters being normalized through nfkc. There is a further problem there, because the CJK compatibility characters have a
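One property of CJK compatibility ideographs that bears on this question is that they have canonical (not compatibility) decompositions, so NFC and NFD replace them as well, not only NFKC. Whether that is the point the truncated sentence above goes on to make is an assumption. A minimal Python check using U+F900 as the sample character:

```python
import unicodedata

compat_ideograph = "\uF900"  # CJK COMPATIBILITY IDEOGRAPH-F900

# All four normalization forms map it to the unified ideograph U+8C48.
forms = {
    form: unicodedata.normalize(form, compat_ideograph)
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}
```

In other words, for these characters, choosing NFC over NFKC does not preserve the user's original code point.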

Re: Writing Tatar using the Latin script; new characters to encode?

2004-07-27 Thread Eric Muller
Mark E. Shoulson wrote: Unicode exists to support what people use. Do people use Latin script for Tatar? Evidence indicates that they do. Should Unicode support it, then? Certainly. Does Unicode support it? Yes, Unicode supports the Latin script, with gobs of extensions. So what's the pro

Re: Question on CLDR number patterns

2004-05-25 Thread Eric Muller
Mark Davis wrote: The decimal format looks like the following: #,##0.###;#,##0.###- I was actually looking at the locales through the ICU explorer, which apparently replaces the localizable characters by those specified in the , hence my confusion. (We s

Question on CLDR number patterns

2004-05-25 Thread Eric Muller
The decimal pattern for Arabic/Kuwait contains U+0660 ٠ ARABIC-INDIC DIGIT ZERO, apparently for the MinimumInteger part (using the Java DecimalFormat terminology), presumably to select the set of Arabic digits. However, this mechanism does not seem to be part of the Java patterns, so I suspect

Re: Proposal to encode dominoes and other game symbols

2004-05-25 Thread Eric Muller
Philippe Verdy wrote: A suggestion for playing cards: why not including the "Tarots"? I mean in French the 4 "Cavaliers" figures, the 18 "Atouts", and the "Excuse" (which is not exactly a Joker); sorry I don't have their English names. Make that 21 atouts (labeled "1" through "21"), for a total o

Re: ISO 15924 French name "Gotique": a typo...???

2004-05-21 Thread Eric Muller
Michael Everson wrote: Collins-Robert Senior Dictionnaire Français-Anglais Anglais-Français gothique [architecture, style] Gothic. Écriture ~ Gothic script That means Fraktur gotique [ling] Gothic That means Wulfilan Stet. Le Petit Robert (1987) concurs with your assessment: --- GOTIQUE. Voir GOTH

Writing Tatar using the Latin script; new characters to encode?

2004-05-11 Thread Eric Muller
According to , there is currently an effort to convert the writing of Tatar from Cyrillic to Latin. 1. Does somebody have more information about that effort? Eki lists four characters as needed but missing in Unicode (see ). 2

Re: GB18030 and super font

2004-04-22 Thread Eric Muller
Raymond Mercier wrote: Mark Shoulson writes >their Super Font is bundled with Microsoft Office XP, and > even Microsoft's prices haven't gotten that high! >From Microsoft, http://www.microsoft.com/globaldev/DrIntl/columns/015/default.mspx : "A font that contains Simp

Re: GB18030 and super font

2004-04-22 Thread Eric Muller
Raymond Mercier wrote: But that link to proofing tools leads nowhere. Maybe it's not so easy to get the CHS version. Includes ~140 fonts, mostly for CJK, Arabic, Hebrew but other scripts as well. Includes "Simsun (Founder Extended)" aka "宋体-方正超大字符集", with 65,531 glyph

Re: Fixed Width Spaces (was: Printing and Displaying DependentVowels)

2004-04-04 Thread Eric Muller
Kenneth Whistler wrote: Uh, no. is equivalent to . I suspect that "equivalent" is only for some aspects. In particular, NBSP has a bidi category of CS, which means that "A 07 B" (in bidi notation) displays as "B 0 7 A", while "A 07 B" displays as "B 7 0 A". Eric.
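The bidi categories behind this remark can be read from Python's standard `unicodedata` module. This sketch only looks up properties; it does not run the bidirectional algorithm, and the variable names are illustrative.

```python
import unicodedata

# Bidi category of NO-BREAK SPACE versus an ordinary SPACE.
nbsp_bidi = unicodedata.bidirectional("\u00A0")
space_bidi = unicodedata.bidirectional("\u0020")

# The two are related only by a compatibility decomposition.
nbsp_decomposition = unicodedata.decomposition("\u00A0")
```

The lookup gives CS (common separator) for NBSP and WS (whitespace) for SPACE, which is why the two can be ordered differently between digits in right-to-left text even though one decomposes to the other.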

Re: Converting between Shift-JIS and Unicode

2004-04-01 Thread Eric Muller
Rick Cameron wrote: Could you please point me to information on the relationship between JIS X 0208-1990 (as represented by the kJis0 field in Unihan.txt) and Shift-JIS? The JIS X 0208 and JIS X 0213 include a description (in Japanese, but with pictures) of the relationship. They are available

Re: LATIN SMALL LIGATURE CT

2004-02-27 Thread Eric Muller
Peter Constable wrote: Adobe included this and other ligatures in their use of the PUA for their own legacy reasons; More specifically, to allow applications which do not have OpenType layout to display those ligatures. it has otherwise never been necessary for them to do so in their Pro fonts

Re: collation of small capitals

2004-01-31 Thread Eric Muller
Philippe Verdy wrote: The most common use I have seen of small capitals is as a font style, where they were used to represent lowercase letters (the uppercase letters being presented with full-height style). It is not proper to encode English, French, ..., text that is eventually rendered using

Re: Useful Breton links

2004-01-16 Thread Eric Muller
Michael Everson wrote: And Skol Diwan. These are indeed schools that teach Breton and other subjects in Breton, but they are not public schools, in the sense of being run by the state. There are "true" public schools that teach in Breton, and Div Yezh is a parent's association that promotes

Re: Useful Breton links

2004-01-16 Thread Eric Muller
I'd really like to know more about Breton, [...] it is not supported by public schools Breton is taught in public schools in France, including in bilingual programs in elementary schools (about 50 schools in 2002). Look for Div Yezh. Kenavo, Eric.

Re: OT: Free Fonts

2003-12-04 Thread Eric Muller
John Hudson wrote:  ClearType is a proprietary renderer that Microsoft don't share with anyone. Although that is changing: http://www.infoworld.com/article/03/12/03/HNmicrosoftip_1.html Towards the end: Microsoft expects most licensing arrangements to be made one-on-one with interested c

Linguistic Diversity and National Unity: Language Ecology in Thailand

2003-11-17 Thread Eric Muller
I just finished reading “Linguistic Diversity and National Unity: Language Ecology in Thailand” by William Smalley, University of Chicago Press, ISBN 0-226-76288/9, and I found it very interesting. However, I have no reference to judge it against. Can anybody comment on it? Any significant ch

Re: Hacek - Typing from a keyboard... Help!!!!

2003-10-29 Thread Eric Muller
Rick McGowan wrote: "Caron" [...] is *NOT* in current use at all in English. It is widely used in the typography community, for better or for worse. Eric.

Re: Unicode and Script Encoding Initiative in San Jose Mercury News

2003-10-25 Thread Eric Muller
Doug Ewell wrote: [...] about "You see, boys and girls, computers think only in numbers" -- in a Silicon Valley paper, [...] Should we tell them about “real” quotes? “real quotes” are not just for Web publication; they are also for email. Throw in real dashes, of the kind – en or em – you pr

Re: Some questions about fractions

2003-09-30 Thread Eric Muller
Jill Ramonsky wrote: I'm wondering, exactly how equivalent are the following sequences: U+00BC (vulgar fraction one quarter) U+215F U+0034 (fraction numerator one; digit four) U+0031 U+2044 U+0034 (digit one; fraction slash; digit four) In particular, should they be rendered
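As far as normalization goes, the question of how equivalent the three sequences are can be checked in Python. This sketch says nothing about rendering, which is the other half of the question, and the variable names are chosen only for the example.

```python
import unicodedata

candidates = [
    "\u00BC",              # VULGAR FRACTION ONE QUARTER
    "\u215F\u0034",        # FRACTION NUMERATOR ONE, DIGIT FOUR
    "\u0031\u2044\u0034",  # DIGIT ONE, FRACTION SLASH, DIGIT FOUR
]

# Canonical normalization keeps the three sequences distinct ...
nfc_forms = {unicodedata.normalize("NFC", s) for s in candidates}

# ... while compatibility normalization maps all of them to one sequence.
nfkc_forms = {unicodedata.normalize("NFKC", s) for s in candidates}
```

So the three are compatibility-equivalent but not canonically equivalent: `nfc_forms` has three members and `nfkc_forms` has one.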

Re: About that alphabetician...

2003-09-25 Thread Eric Muller
Michael Everson wrote: An Irish colleague here said he liked the article but noted that the Times' web directors don't use Unicode ... ... There is an alternative point of view, which says that charset declared in an HTML (or XML) document is no more than

Re: W3C Objects To Royalties On ISO Country Codes

2003-09-25 Thread Eric Muller
See also . Eric.

Re: Michael Everson in the news

2003-09-25 Thread Eric Muller
See also , which is apparently about SEI. Eric.

Re: Questions on Myanmar encoding

2003-09-24 Thread Eric Muller
Thank you very much for your help. I don't know what you mean third row of Table 10.3. It is in Unicode 4.0, section 10.3, page 273, and you can see it at: With current model.. 1018 102C 1039 101B 1031 102C 1010 102C 101C 1032 0

Questions on Myanmar encoding

2003-09-18 Thread Eric Muller
1. What is encoded by the sequence of characters U+1004 င MYANMAR LETTER NGA U+1039 ◌္ MYANMAR SIGN VIRAMA U+1004 င MYANMAR LETTER NGA is it kinzi + consonant NGA or consonant NGA+ subscript consonant NGA? Should we add some words to Table 10.3 to clarify that? 2. Does consonant + subscript cons

Re: Faulty ligatures in Adobe PhotoShop

2003-08-27 Thread Eric Muller
Doug Ewell wrote: Anto'nio Martins-Tuva'lkin wrote: The bad part of it is that the ligated characters shown (in the second and third examples) seem to include a long "s" instead of an "f"... > attached for reference. Thanks for the report, I’ll forward to the Photoshop gu

Re: Unicode 4.0 is online at last!

2003-08-14 Thread Eric Muller
Peter Kirk wrote: And indeed the software being used is produced by a consortium member. Perhaps the embarrassment should be more that member's, that their software is not Unicode compatible. The member in question is a company. Companies are not embarrassed nor ashamed. whilst implementin

Re: Does Unicode 3.1 take care of all characters of 'Hong Kong SupplimentaryCharacter Set - 2001' (HKSCS-2001) ?

2003-08-04 Thread Eric Muller
John McConnell wrote: The mapping of the HKSCS 2001 repertoire to ISO/IEC 10646-2:2001 has 35 mapped to the private use area 1651 mapped to supplementary plane 2 511 mapped to the Extension A block (on the BMP) 2212 mapped to the CJK Ideographic block (also on the BMP) plus another 278 mapped e

Re: From [b-hebrew] Variant forms of vav with holem

2003-07-30 Thread Eric Muller
Mark Davis wrote: The UTC accepts and considers proposals from other parties (see http://www.unicode.org/pending/proposals.html for submitting a proposal for new characters). For complex matters (which this definitely seems to be, based on the volume of mail!), it is far and away the best if som

Re: U+23D0 VERTICAL LINE EXTENSION

2003-07-24 Thread Eric Muller
Alan Wood wrote: I think this leaves only one character in the old Symbol font that does not have a Unicode equivalent: RADICAL EXTENDER (decimal 96 in the Windows version) When I prepared the proposal for U+23D0 ⏐ VERTICAL LINE EXTENSION, it was indeed to ensure the complete representation

Re: U+1D29

2003-05-30 Thread Eric Muller
Anto'nio Martins-Tuva'lkin wrote: I've just downloaded the PDF files with 4.0 additions (U40-*.pdf). One question: How is one supposed to tell apart the glyphs for U+1D29 and U+1D18?... Or one isn't?... In the same way that you tell apart the glyphs for U+0050 P LATIN CAPITAL LETTER P and U+

Re: ISO 8859_2 and Windows 1250

2003-03-12 Thread Eric Muller
Otto Stolz wrote: CP 1250 contains the ISO 8859-1 characters, hence it is not suited for Slavic languages. I suspect that Otto meant to type "CP 1252 contains..." Eric.

Re: Handwritten EURO sign

2003-02-07 Thread Eric Muller
The latest issue of Baseline (www.baselinemagazine.com) has an article on the Euro. I did not read it, so I don't know if it speaks of handwritten forms. Sign of the times: the euro currency symbol by Conor Mangat. Eric.

Re: 4701

2003-02-01 Thread Eric Muller
Michael Everson wrote: Happy New Year of the Yáng to everybody! (I can't work out whether it's the Year of the Sheep, the Goat, or the Ram.) Ram. Eric.

Re: urban legends just won't go away!

2003-01-31 Thread Eric Muller
Barry Caplan wrote: Who knew in this day and age flipping bits to change case is still publishable (this is from today!) What I find a lot more objectionable is that what this code pretends to do is not defined (in particular, the domain over which it applies). Without such qualification,
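The point about the undefined domain is easy to demonstrate. The published code under discussion is not reproduced in this archive, so `flip_case_bit` below is a hypothetical stand-in for the usual "toggle bit 0x20" trick, written in Python for illustration.

```python
def flip_case_bit(ch: str) -> str:
    """Toggle bit 0x20 of a single character's code point (hypothetical stand-in)."""
    return chr(ord(ch) ^ 0x20)

# Inside the ASCII letters the trick happens to work.
ascii_letter = flip_case_bit("a")

# Outside that domain it silently produces unrelated characters.
at_sign = flip_case_bit("@")        # becomes a grave accent
y_diaeresis = flip_case_bit("\u00FF")  # becomes U+00DF, not an uppercase form

# The real uppercase mappings, for comparison.
real_upper_y_diaeresis = "\u00FF".upper()
real_upper_sharp_s = "\u00DF".upper()
```

Without a statement that the input is restricted to the ASCII letters A-Z and a-z, the function's behaviour on anything else is simply wrong, and some mappings (such as the one for U+00DF) are not even one character to one character.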

Re: Oh No! Not a new Adobe Glyph List!!!

2003-01-02 Thread Eric Muller
The owner of the Adobe Glyph Naming conventions does not read this mailing list, and the subject is only loosely related to Unicode. You may want to inquire on the opentype mailing list ([EMAIL PROTECTED]), Eric.

Re: Oh No! Not a new Adobe Glyph List!!!

2003-01-02 Thread Eric Muller
Rick McGowan wrote: The AFII standard is not only obsolete, there isn't even any publicly available reference for the numbers. What is the relationship between AFII and ISO/IEC 10036 (Information technology -- Font information interchange -- Procedures for registration of font-relat

Re: Documenting in Tamil Computing

2002-12-17 Thread Eric Muller
I don't understand what you meant by Unicode not being mature enough to support multilingual emails. Maybe the argument is simply that there are not enough email agents that can render Tamil properly from Unicode-encoded text, and that email rarely has a useful life that justifies pain today.

Re: converting devanagari to mangal unicode

2002-12-16 Thread Eric Muller
In order to convert any Devanagari font to be rendered in the same way, Maybe Sunil is just asking for a conversion of data, presumably from ISCII to Unicode. Eric.

Re: IPA for "hard g"

2002-12-15 Thread Eric Muller
Doug Ewell wrote: I didn't know (and had not checked) whether the Handbook was available to non-members of the Association. ISBN 0-521-63751-1. Cambridge University Press. List price is $18 in the US. Available through Amazon and such. Eric.

entering JIS 0213, HKSCS and GB 18030 characters

2002-10-30 Thread Eric Muller
We have a very hard time assembling the following information: on MacOS X and Windows XP, how do users practically enter JIS 0213, HKSCS and GB 18030 characters? We are interested in both OS-provided IMEs and third-party IMEs. Of course, we are interested in the more "recent" characters in thos

TDIL information on Indic languages.

2002-09-26 Thread Eric Muller
This may be of interest for people working with Indic languages. Eric. Original Message Subject: [li18nux:1096] Re: Linux Future Survey Date: Thu, 26 Sep 2002 16:40:36 +0530 From: "Dutta Abhijit" <[EMAIL P

[OT] looking for electronic dictionaries

2002-08-29 Thread Eric Muller
For my personal use, I would like to acquire electronic dictionaries, principally for the major European languages, with the following characteristics: - reputable source - "raw" datafiles accessible - I appreciate the interfaces that dictionary vendors may provide, but I want to be able to w

OCR characters

2002-08-15 Thread Eric Muller
In our OCR fonts, we have two glyphs named "erase" (looks like a black square) and "grouperase" (looks like a long dash). I don't have a copy of the OCR standards, but I suspect those are mandated by these standards. On the other hand, I can't find traces of those in Unicode, so I suspect

Re: New version of TR29:

2002-08-15 Thread Eric Muller
> > Your definition of "LatinVowel" is problematic. Is "Y" only a vowel in > > French? In a word such as "yeux", it certainly is a consonant. Could > > this lead to problems? > > I don't think so, but I wait for the opinion of French speakers. > > What I can see is that things like "l'yaou

Re: "Missing character" glyph- example

2002-08-01 Thread Eric Muller
John Hudson wrote: > but it should *not* be encoded as U+ or as any other codepoint. > .notdef should be unencoded. Almost. OpenType specifies that there is no functional difference between a code point that is not mapped and a code point that is explicitly mapped to GID 0, so there is
