Re: Application that displays CJK text in Normalization Form D
All Cocoa/Cocoa Touch apps display them correctly.

Aki Inoue

On 2010/11/13, at 17:07, Bill Poser wrote:
> On Sat, Nov 13, 2010 at 4:46 PM, Jim Monty wrote:
> Is there even a single software application that properly displays CJK text in Normalization Form D?
>
> I just tried your examples in Yudit (http://www.yudit.org) and they seem to work: the NFD text looks the same as the NFC text.
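For anyone who wants to repeat the comparison, here is a minimal Python sketch using only the standard `unicodedata` module. It builds the NFC and NFD forms of a Hangul syllable and a voiced kana. The two sample characters are my own choice for illustration, not Jim's original examples.

```python
import unicodedata

def forms(text):
    """Return the (NFC, NFD) forms of text."""
    return (unicodedata.normalize("NFC", text),
            unicodedata.normalize("NFD", text))

def code_points(text):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

# Sample characters are assumptions chosen for illustration:
# U+D55C HANGUL SYLLABLE HAN and U+304C HIRAGANA LETTER GA.
for sample in ("\ud55c", "\u304c"):
    nfc, nfd = forms(sample)
    print(nfc, code_points(nfc), "|", nfd, code_points(nfd))
```

An application handles NFD properly if the two strings printed on each line look identical on screen, even though the second one is made of several code points.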
Re: Problem display ALL Chinese Characters?
Chang,

> 3. Convert simplified Chinese characters to something like "\u00e7\u2030\u02c6\u00e6\ufffd\u0192\u00e7\u00bb\u0153\u00e6\u02dc\u017d\u00e7\u00bd\u2018\u00e7\u00bb\u0153" using native2ascii tool provided by sun.

native2ascii seems to be reading your Simplified Chinese source file as cp1252 instead of cp936. You can specify the encoding on the command line, like this:

native2ascii -encoding windows-936

Aki

On 2002.12.10, at 03:18 PM, Chang Liu wrote:
> Hi,
>
> I am trying to display Chinese characters (Simplified) in a Java Swing app on an English Windows 2000 OS. But I am having trouble displaying all the Chinese characters (only some characters are displayed). Here is the info about my app:
>
> 1. Java Swing app.
> 2. Windows 2000 OS.
> 3. Convert simplified Chinese characters to something like "\u00e7\u2030\u02c6\u00e6\ufffd\u0192\u00e7\u00bb\u0153\u00e6\u02dc\u017d\u00e7\u00bd\u2018\u00e7\u00bb\u0153" using native2ascii tool provided by sun.
> 4. The app reads the above codes and displays partial Chinese characters (Dialog font).
> 5. I tried some other Chinese characters as well; it can display some of them too.
>
> Can anyone help me with this? It seems that the simplified Chinese character set I am using does not include all the characters. And if so, how can I add more?
>
> thanks!
>
> Chang Liu
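To see how a wrong source encoding produces escapes like these, here is a small Python sketch. It is an illustration only, not a reimplementation of native2ascii: the helper name `to_ascii_escapes` is made up, and the sample character (U+7248) is my own choice. Incidentally, the first three escapes in the quoted string are exactly what you get when the UTF-8 bytes of U+7248 are read as cp1252, so it may be worth checking whether the source file is actually saved as UTF-8 rather than cp936.

```python
def to_ascii_escapes(raw: bytes, encoding: str) -> str:
    """Decode raw bytes with the given encoding, then write every
    non-ASCII character as a \\uXXXX escape (BMP characters only)."""
    text = raw.decode(encoding, errors="replace")
    return "".join(ch if ord(ch) < 128 else f"\\u{ord(ch):04x}" for ch in text)

# One Chinese character saved as UTF-8: bytes E7 89 88.
raw = "\u7248".encode("utf-8")

print(to_ascii_escapes(raw, "cp1252"))  # wrong encoding: one escape per byte
print(to_ascii_escapes(raw, "utf-8"))   # right encoding: a single escape
```

With the wrong encoding each byte of the character becomes its own escape, and the application later displays three Latin-range characters instead of one Chinese character. Bytes that the wrong encoding does not define come out as \ufffd, which is also visible in the quoted string.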
Re: Summary of Unicode/language features in Mac OS X 10.2 "Jaguar"
John,

Since you're dealing with polytonic Greek, I suppose your original file is plain-text Unicode.

First of all, TextEdit offering GB 18030 ahead of Unicode seems to be a bug in the application. The logic here simply selects a character encoding that can represent all the characters in the document. For the user's sake, it should always select Unicode over GB 18030. If GB 18030 annoys you, you can remove the encoding from the menu.

As for the mixed-font selection: the Cocoa Text System tries very hard to "honor" the font the user originally selected. For a document opened from a plain text file, that font is Monaco (you can change the setting in TextEdit's preference panel). From there, Cocoa looks for a font character by character, preferring the system-supplied fonts. As you discovered, you can always change the font and save as rich text, which preserves the font setting.

Aki Inoue
Object App Framework
Apple Inc

On 2002.8.27, at 07:39 PM, John Delacour wrote:
> On Sat Aug 24 2002 - 13:18:08 EDT Deborah Goldsmith wrote:
>
>> - Keyboards may be installed by dragging in the Finder to /Library/Keyboard Layouts/, ~/Library/Keyboard Layouts/, or /Network/Library/Keyboard Layouts/, then logging out and logging back in.
>
> This is excellent news. I have just succeeded, with the invaluable guidance of Alex Eulenberg, in installing a keyboard for polytonic Greek that I have been developing and using, until today (when I installed Jaguar), in OS 9 and in 10.1.5. This new facility makes it simple for anyone with no technical knowledge to install a ready-made keyboard layout.
>
>> - Fonts to support Arabic, Hebrew, Cyrillic, Devanagari, Gujarati, Gurmukhi, Polytonic Greek, and Thai (some of these are optional installs)
>
> Hmm. I wonder if this is what is causing problems. I have three or four third party fonts to support polytonic Greek, including Arial Unicode MS, which I placed in ~/Library/Fonts/ straight after installing Jaguar. I have had no problem working in pGreek in MacOS 9 and 10.1.5. Until today it was possible to work on the same document either in WorldText in Classic or in TextEdit and it was even possible to drag pGreek text from WorldText to TextEdit. Text in a document was displayed faultlessly in a single font.
>
> Now, in Jaguar, if I open a plain text pGreek document in TextEdit, about half the document is displayed in a clear font with apparently only the pi (as always) being borrowed from the ascii set, and the rest in a mixture of fonts. Literally as I was writing this, just such a document in the background caused the machine to lock me out with a rainbow wheel in order to re-display itself in a quite different mixture of Greek fonts. If I select characters I find the pi is in Monaco from the MacRoman set, others are in Lucida Grande, others in Hiragino Kaku Gothic, some in Caslon! This happens no matter how I set my preferences.
>
> If I create a new document, type in a few characters of polytonic Greek and then go to save the document, the default encoding option offered me is Chinese (GB 18030)! so it clearly has no idea what it is meant to be, and until I force the document to a single font, it remains a mess of bits and pieces from all over the place -- all Greek, but all from different fonts. So long as the document remains plain text, it will open every time as the mish-mash I have described.
>
> I'm sure there is a good explanation for this and I can see that 10.2 is a great leap forward, not least in the Unicode department. I'd be happy to send you samples and screen shots to show the problem. The fact is that on the face of things, the situation for polytonic Greek was excellent before Jaguar and extremely confused now for this user.
>
> JD
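The encoding selection described in the reply above ("a character encoding that can represent all the characters in the document") can be sketched in a few lines of Python. This is only an illustration of the general idea: the function name and both candidate lists are hypothetical, and nothing here is taken from TextEdit's actual implementation.

```python
def first_lossless_encoding(text, candidates):
    """Return the first encoding in candidates that can represent
    every character of text, or None if none of them can."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # this encoding lacks at least one character
        return name
    return None

# A short polytonic Greek sample (includes U+1F00, alpha with psili).
greek = "\u1f00\u03bd\u03ae\u03c1"

# Hypothetical preference orders, not TextEdit's real list.
print(first_lossless_encoding(greek, ["mac_roman", "gb18030", "utf-8"]))
print(first_lossless_encoding(greek, ["mac_roman", "utf-8", "gb18030"]))
```

Because GB 18030 can encode every Unicode character, it is a valid answer for polytonic Greek whenever it is tried before a Unicode encoding. The first call therefore picks `gb18030` and the second picks `utf-8`, which is consistent with the behaviour JD saw being a matter of ordering rather than of TextEdit misidentifying the text.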
Re: Normalisation and font technology
John,

Let me add a few more points to John Jenkins' comment.

As John mentioned, we're working on adding character-space processing capability to the OS. In fact, the Cocoa framework, one of the two primary APIs in Mac OS X, can already handle most combining marks reasonably well without help from AAT tables.

Please note that neither a glyph-space-only nor a character-space-only solution can fully embrace the flexibility of Unicode. It needs to be a hybrid. With a glyph-space-only solution, you lose the character semantics defined in Unicode when typesetting. With a character-space-only solution (meaning applying NFC before rendering), you cannot render the arbitrary composed character sequences that Unicode allows.

As the Cocoa framework currently does, you can position combining characters without precomposition in most cases just by looking at the combining class. However, it is still desirable for fonts to carry positioning information for a better typographical result, since font designers themselves know best about the glyphs they're designing.

Aki

On 2002.05.29, at 11:29, John Hudson wrote:
> At 10:24 5/29/2002, John H. Jenkins wrote:
>
>>> In particular, I think it is a mistake to resolve display of character-level decompositions by relying on the presence of glyph-space substitution or positioning features in fonts, simply because most users have very few fonts that are capable of doing this.
>>
>> Agreement; Apple's current solution is a "better-than-nothing" one, but not really what's best in the long run IMHO. BTW, does FontLab 4 auto-generate OT layout data from the Unicode repertoire of a font?
>
> It could be made to fairly easily using existing functions and Python scripting, but it isn't a built-in automatic feature.
>
> There are, however, architectural reasons why some layout data that you are putting into AAT fonts is deliberately absent from OpenType. There are currently no OpenType features for specifically handling canonical composition or decomposition of glyphs representing Unicode strings*, and I don't think such features would get very far if one proposed them to Microsoft and Adobe. OpenType tries to maintain a clear distinction between what should be handled in character space and what should be handled in glyph space, whereas AAT is content to handle pretty much everything in glyph space. The impression I have from discussions with people in the type groups at Microsoft and Adobe is that they are agreed that canonical decomposition and its resolution for display is something that should happen at the character level, prior to any glyph processing. Because of the architectural principles of OpenType, Apple's 'better-than-nothing' approach as you describe it, would more likely be seen as 'nothing is better'. I get the impression that Apple are willing to release temporary measures while working on better long term solutions, while Microsoft prefer to wait until the long term solution is ready. Either approach can be valid, but Apple's is facilitated in this instance by the fact that they have a font architecture in which doing character level processing in glyph space is acceptable.
>
> * OpenType has a slightly misnamed Character Composition/Decomposition feature (it is actually a glyph composition/decomposition feature), which enables font developers to make decisions about how best to handle display of individual typeforms. But this is not limited to, or even appropriate for, resolving canonical character decomposition, since it can be used to decompose or compose any glyph in a font. In a single font, this feature might be used to compose some glyphs (e.g. representing the Hebrew hataf qamats and meteg marks as a single glyph) and decompose others (e.g. decomposing the Arabic alif with hamza in order to take advantage of coloured diacritic marks in Word).
>
> John Hudson
>
> Tiro Typeworks www.tiro.com
> Vancouver, BC [EMAIL PROTECTED]
>
> When the pages of books fall in fiery scraps
> Onto smashed leaves and twisted metal,
> The tree of good and evil is stripped bare.
> - Czeslaw Milosz
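Two points from the reply at the top of this message can be checked with Python's standard `unicodedata` module: NFC cannot precompose every sequence Unicode allows, and each combining mark carries a canonical combining class that a renderer can use as a rough positioning hint. The sketch below is only an illustration with sample sequences of my own choosing. It says nothing about how Cocoa, AAT, or OpenType are actually implemented.

```python
import unicodedata

def describe(text):
    """Return (NFC form, canonical combining class of each character)."""
    nfc = unicodedata.normalize("NFC", text)
    classes = [unicodedata.combining(ch) for ch in nfc]
    return nfc, classes

# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9).
# "q" + COMBINING ACUTE ACCENT has none, so NFC leaves it as a sequence.
# The third sample stacks a dot below (class 220) and an acute (class 230).
for seq in ("e\u0301", "q\u0301", "q\u0323\u0301"):
    nfc, classes = describe(seq)
    print([f"U+{ord(ch):04X}" for ch in nfc], classes)
```

Class 0 marks a base character, 220 a mark attached below, and 230 a mark attached above. For sequences like the second and third, normalizing to NFC before rendering does not remove the need to position the marks.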
Re: Japanese Word2000 question
Hi Rick,

You can get that 'zu' by typing 'du' with MS-IME.

Aki

On Friday, 7 13, 2001, at 12:40 PM, Rick McGowan wrote:
> Speaking of all this UTF-8 & mojibakes etc, Here's a question for the Japanese speakers & users of Word 2000...
>
> I'm using Word2k on Win98. How do you input the syllables U+3065 and U+30C5 with the Japanese Global IME? I.e., I want the "zu" syllables obtained by adding dakuten to "tsu" rather than "su". All romaji inputs of "zu", "ZU" etc give me U+305A and U+30BA. I can't seem to get the ones I want. This is for a particular use where the others won't do.
>
> Rick
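As a quick way to tell the two pairs apart once the text has been typed, the following Python sketch (standard `unicodedata` only, helper name made up for illustration) prints the code point, Unicode character name, and canonical decomposition of the four syllables Rick mentions.

```python
import unicodedata

def kana_info(ch):
    """Return (code point, Unicode name, NFD code points) for one kana."""
    nfd = unicodedata.normalize("NFD", ch)
    return (f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            [f"U+{ord(c):04X}" for c in nfd])

# The two "zu" pairs from the question: su + dakuten and tsu + dakuten.
for ch in ("\u305a", "\u3065", "\u30ba", "\u30c5"):
    print(ch, *kana_info(ch))
```

The Unicode names of U+3065 and U+30C5 end in "LETTER DU", which lines up with the 'du' romaji input suggested above, and their decompositions start with the tsu kana (U+3064, U+30C4) followed by U+3099.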
Re: Mac support of UCAS in Unicode 3.0
OmniWeb is one of the few Web browsers that display UTF-8 encoded Web pages. As long as you have a UCAS font with a Unicode cmap, you should be able to display UCAS text with the browser.

Aki Inoue
Apple Computer Inc.
Object App Framework

----- Original Message -----
From: "Rick McGowan" <[EMAIL PROTECTED]>
To: "Unicode List" <[EMAIL PROTECTED]>
Sent: Thursday, October 26, 2000 9:28 PM
Subject: Re: Mac support of UCAS in Unicode 3.0

> The only exception I am aware of to this rule is the OmniWeb application which runs only on Mac OS X.

One disadvantage of OmniWeb, by the way, for international use, is that it requires that you set (in a preference panel) the encoding it uses for pages that it renders; it doesn't know about looking at the HTML meta tags for encoding. It knows about the more-or-less complete Mac codeset repertoire; UTF-8 is among them, but I think no other UCAS-capable encoding is available.

Rick
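For readers unfamiliar with the meta-tag lookup Rick mentions, here is a rough Python sketch of the idea: scan the start of the page for a charset declaration and fall back to a user-set default when there is none. The regular expression, the 1024-byte window, the function name, and the sample page are all assumptions made for illustration. This is not how OmniWeb or any particular browser works, and real browsers follow a much more detailed detection procedure.

```python
import re

# Matches both <meta charset="..."> and the http-equiv/content form.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([\w.:-]+)""", re.IGNORECASE)

def sniff_charset(html: bytes, default: str) -> str:
    """Return the charset named in a <meta> tag near the start of the
    page, or the caller's default when no declaration is found."""
    match = META_CHARSET.search(html[:1024])
    return match.group(1).decode("ascii").lower() if match else default

# Hypothetical page containing a few UCAS characters encoded as UTF-8.
page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=UTF-8"></head><body>'
        + "\u1403\u14c4\u1483".encode("utf-8")
        + b"</body></html>")

encoding = sniff_charset(page, default="mac_roman")
print(encoding, page.decode(encoding))
```

The `default` argument plays the role of the preference-panel setting: it is used only when the page does not declare an encoding itself.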